On Thu, 2012-07-05 at 20:01 -0700, Nicholas A. Bellinger wrote:
> So I'm pretty sure this discrepancy is attributed to the small block
> random I/O bottleneck currently present for all Linux/SCSI core LLDs
> regardless of physical or virtual storage fabric.
>
> The SCSI wide host-lock less conversion that happened in .38 code back
> in 2010, and subsequently having LLDs like virtio-scsi convert to run in
> host-lock-less mode have helped to some extent.. But it's still not
> enough..
>
> Another example where we've been able to prove this bottleneck recently
> is with the following target setup:
>
> *) Intel Romley production machines with 128 GB of DDR-3 memory
> *) 4x FusionIO ioDrive 2 (1.5 TB @ PCI-e Gen2 x2)
> *) Mellanox PCI-exress Gen3 HCA running at 56 gb/sec
> *) Infiniband SRP Target backported to RHEL 6.2 + latest OFED
>
> In this setup using ib_srpt + IBLOCK w/ emulate_write_cache=1 +
> iomemory_vsl export we end up avoiding SCSI core bottleneck on the
> target machine, just as with the tcm_vhost example here for host kernel
> side processing with vhost.
>
> Using Linux IB SRP initiator + Windows Server 2008 R2 SCSI-miniport SRP
> (OFED) Initiator connected to four ib_srpt LUNs, we've observed that
> MSFT SCSI is currently outperforming RHEL 6.2 on the order of ~285K vs.
> ~215K with heavy random 4k WRITE iometer / fio tests. Note this with an
> optimized queue_depth ib_srp client w/ noop I/O schedulering, but is
> still lacking the host_lock-less patches on RHEL 6.2 OFED..
>
> This bottleneck has been mentioned by various people (including myself)
> on linux-scsi the last 18 months, and I've proposed that that it be
> discussed at KS-2012 so we can start making some forward progress:

Well, no, it hasn't.
You randomly drop things like this into unrelated email (I suppose
that is a mention in strict English construction) but it's not really
enough to get anyone to pay attention since they mostly stopped reading
at the top, if they got that far: most people just go by subject when
wading through threads initially.

But even if anyone noticed, a statement that RHEL6.2 (on a 2.6.32
kernel, which is now nearly three years old) is 25% slower than W2k8R2
on infiniband isn't really going to get anyone excited either
(particularly when you mention OFED, which usually means a stack
replacement on Linux anyway).

What people might pay attention to is evidence that there's a problem in
3.5-rc6 (without any OFED crap). If you're not going to bother
investigating, it has to be in an environment they can reproduce (so
ordinary hardware, not infiniband) otherwise it gets ignored as an
esoteric hardware issue.

James