On Tue, Nov 10, 2009 at 3:35 PM, Martin K. Petersen <martin.petersen@xxxxxxxxxx> wrote:
>>>>>> "Chris" == Chris Worley <worleys@xxxxxxxxx> writes:
>
> Chris> I'm not saying the SCSI protocol is bad, I'm saying the
> Chris> SAS/SATA/SCSI controllers, that have been optimized for years for
> Chris> rotating media, don't have the compute power to handle the sort
> Chris> of performance attainable with SSS.
>
> And I'm saying that at least in the SCSI case that's untrue. SAS and FC
> controllers are optimized for lots and lots of I/O because their main
> application is driving large storage arrays which have performance
> comparable to the solid state devices you mention.

We're going to have to agree to disagree on this.  My feeling is that
you haven't tried the next generation of I/O performance, only the slow
SSDs currently available, and don't (yet) see the potential in getting
rid of all the hardware layers that evolved around rotating media.

And when you talk of FC... there's more performance inhibition.  Slow
hardware like 10G Ethernet and FC8 can't keep up with the performance
required for fast SSS I/O.  A single QDR IB port is a good start, with
3GB/s per port (measured using SRP to export the drives).  How many FC8
or 10G iSCSI ports would it take to match one QDR IB port's
performance?  Then start thinking 2x, 4x, 8x, ... and the complexity of
the old hardware becomes daunting when trying to scale.

Again, to scale easily and with less complexity, you need your
fundamental components to be fast.  SSDs, FC, and 10G are
last-generation hardware and far too slow.  You could use a '90s-vintage
distributed supercomputer with hundreds of processors to run tasks that
a single CPU can do as fast or faster today... but many would agree that
the new CPU is the easier choice.

And again, I'm not attacking the SCSI protocol, just the controller
performance.  Getting rid of unnecessary OS software layers (i.e. when
you can use the block device directly) also buys performance.  When
you've got <50usec latency storage, every CPU cycle counts.

<snip>
>
> Chris> I'd run IOZone and fill the drive (as I recall ~200GB) w/ files
> Chris> and benchmark, which, at the end, IOZone would delete all the
> Chris> files created (in the hundreds), and the delete/discard process
> Chris> was no more time consuming than just the delete process (for
> Chris> everything on the drive).  This was w/ the original 2.6.27 and
> Chris> 2.6.28 ext4 "discard" implementations.
>
> And which device was this?  How did it implement discard?

This was a non-GPL driver (as is the management layer for all SSDs), so
I doubt you're interested.

The methodology used was the one laid out by David Woodhouse in:

http://lwn.net/Articles/293658/

Basically: 1) register for discard, 2) decode the write bios flagged
"discard", 3) send completion when done.  (A rough sketch of that bio
path is at the bottom of this mail.)

Chris

>
> --
> Martin K. Petersen      Oracle Linux Engineering
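
For anyone following along, here is roughly what that three-step path
looks like in a bio-based driver.  This is a sketch only, written from
memory against the 2.6.28-era block API the thread refers to, not the
actual (non-GPL) driver: struct ssd_dev and the ssd_ftl_discard() /
ssd_handle_rw() helpers are made-up stand-ins, and the exact way a
queue advertises discard support has changed across kernel versions.

  #include <linux/blkdev.h>
  #include <linux/bio.h>

  struct ssd_dev;                                 /* stand-in device struct */
  void ssd_ftl_discard(struct ssd_dev *dev, sector_t sector, unsigned nr);
  int  ssd_handle_rw(struct ssd_dev *dev, struct bio *bio);

  /* 2) decode: a discard bio carries no data, just a sector range the
   *    FTL can unmap; 3) complete it as soon as that bookkeeping is done.
   */
  static int ssd_make_request(struct request_queue *q, struct bio *bio)
  {
          struct ssd_dev *dev = q->queuedata;

          if (bio->bi_rw & (1 << BIO_RW_DISCARD)) {
                  ssd_ftl_discard(dev, bio->bi_sector, bio_sectors(bio));
                  bio_endio(bio, 0);      /* two-arg form in that era */
                  return 0;
          }

          /* normal read/write path */
          return ssd_handle_rw(dev, bio);
  }

  /* 1) register: hook in the bio handler, plus whatever the kernel of
   *    the day needs so blkdev_issue_discard() doesn't return
   *    -EOPNOTSUPP (a blk_queue_set_discard()-style hook back then,
   *    discard fields in the queue limits in later kernels).
   */
  static void ssd_init_queue(struct ssd_dev *dev, struct request_queue *q)
  {
          q->queuedata = dev;
          blk_queue_make_request(q, ssd_make_request);
  }

On the filesystem side, the ext4 "discard" behaviour mentioned above is
just the discard mount option (mount -o discard); later userspace tools
like blkdiscard(8) and fstrim(8) exercise the same path directly.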