On Sat, Sep 4, 2010 at 12:27 AM, Vasu Dev <vasu.dev@xxxxxxxxx> wrote: > > Currently fc_queuecommand drops this lock very early on and then re-acquires > this lock just before return, this re-acquired lock gets dropped immediately > by its fast path caller scsi_dispatch_cmd, this re-acquire and then immediate > drop on each IO hits performance especially with small size IOs on multi-core > systems, this hit is significant about 25% with 512 bytes size IOs. > > This lock is not needed in fc_queuecommand and calling fc_queuecommand > without this lock held removes above described performance hit. > > So this patch adds unlocked_qcmds flag to drop host_lock before > calling only fc_queuecommand and removes re-acquire & then drop > for each IO. Added flag, drops this lock only if LLD wants such > as fcoe.ko does here for fc_queuecommand. This change won't affect > any existing LLD since added unlocked_qcmds flag will be zero > in those cases and their host lock uses would effectively remain > same after this patch. [ ... ] Hello Vasu, While I don't doubt that your patch improves performance, there might be more bottlenecks in the Linux storage stack than the SCSI host lock. After having converted the SRP initiator (ib_srp) to use per-LUN locking instead of per-host locking and after having enabled unlocked_qcmds, I noticed that the largest source of contention was in the block layer. There was ten times more contention on request_queue.queue_lock than on any other spinlock. As you can see below, the four most contended locks in the test I ran were: * request_queue.queue_lock, needed by __make_request() in blk-core.c. * srp_target_port.lock, the per-LUN lock in ib_srp.c. * The SCSI host lock. * An AIO lock. The test I ran used libaio and the NOOP scheduler. Block merging was disabled via sysfs. >From /proc/lock_stat: lock_stat version 0.3 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- class name con-bounces contentions waittime-min waittime-max waittime-total acq-bounces acquisitions holdtime-min holdtime-max holdtime-total ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- &(&q->__queue_lock)->rlock: 11093047 11093901 0.22 9.37 10163231.15 41697489 74851829 0.00 21.14 60656375.82 -------------------------- &(&q->__queue_lock)->rlock 2093562 [<ffffffff811af406>] __make_request+0x56/0x500 &(&q->__queue_lock)->rlock 2 [<ffffffff811bcd70>] cfq_get_queue+0xf0/0x2a0 &(&q->__queue_lock)->rlock 1 [<ffffffff811bd3e4>] cfq_set_request+0x414/0x500 &(&q->__queue_lock)->rlock 36 [<ffffffff811bd049>] cfq_set_request+0x79/0x500 -------------------------- &(&q->__queue_lock)->rlock 17 [<ffffffff811bd049>] cfq_set_request+0x79/0x500 &(&q->__queue_lock)->rlock 1007349 [<ffffffff811af406>] __make_request+0x56/0x500 &(&q->__queue_lock)->rlock 3036167 [<ffffffff811af45f>] __make_request+0xaf/0x500 &(&q->__queue_lock)->rlock 1 [<ffffffff811bcd70>] cfq_get_queue+0xf0/0x2a0 ............................................................................................................................................................................................... &(&target->lock)->rlock: 1098815 1098945 0.21 3.67 317746.64 16107020 48401787 0.00 14.65 11997072.82 ----------------------- &(&target->lock)->rlock 220238 [<ffffffffa04e1307>] __srp_get_tx_iu+0x47/0xd0 [ib_srp] &(&target->lock)->rlock 366255 [<ffffffffa04e2ee0>] srp_queuecommand+0xb0/0xad0 [ib_srp] &(&target->lock)->rlock 297993 [<ffffffffa04e25b6>] srp_remove_req+0x46/0x80 [ib_srp] &(&target->lock)->rlock 114850 [<ffffffffa04e2ae2>] srp_handle_recv+0x182/0x420 [ib_srp] ----------------------- &(&target->lock)->rlock 277148 [<ffffffffa04e2ee0>] srp_queuecommand+0xb0/0xad0 [ib_srp] &(&target->lock)->rlock 142360 [<ffffffffa04e2ae2>] srp_handle_recv+0x182/0x420 [ib_srp] &(&target->lock)->rlock 245220 [<ffffffffa04e1307>] __srp_get_tx_iu+0x47/0xd0 [ib_srp] &(&target->lock)->rlock 266847 [<ffffffffa04e25b6>] srp_remove_req+0x46/0x80 [ib_srp] ............................................................................................................................................................................................... &(shost->host_lock)->rlock: 989776 989903 0.21 5.78 271707.74 20089204 38990409 0.00 7.07 9908160.13 -------------------------- &(shost->host_lock)->rlock 123198 [<ffffffffa0037583>] scsi_request_fn+0x163/0x4d0 [scsi_mod] &(shost->host_lock)->rlock 235047 [<ffffffffa003036c>] scsi_dispatch_cmd+0x11c/0x3e0 [scsi_mod] &(shost->host_lock)->rlock 389901 [<ffffffffa00383cd>] scsi_device_unbusy+0x3d/0xd0 [scsi_mod] &(shost->host_lock)->rlock 241762 [<ffffffffa0036824>] scsi_run_queue+0x54/0x3d0 [scsi_mod] -------------------------- &(shost->host_lock)->rlock 107827 [<ffffffffa0037583>] scsi_request_fn+0x163/0x4d0 [scsi_mod] &(shost->host_lock)->rlock 95756 [<ffffffffa003036c>] scsi_dispatch_cmd+0x11c/0x3e0 [scsi_mod] &(shost->host_lock)->rlock 316122 [<ffffffffa0036824>] scsi_run_queue+0x54/0x3d0 [scsi_mod] &(shost->host_lock)->rlock 470203 [<ffffffffa00383cd>] scsi_device_unbusy+0x3d/0xd0 [scsi_mod] ............................................................................................................................................................................................... &(&ctx->ctx_lock)->rlock: 399855 399868 0.22 4.22 143530.86 7351017 50104676 0.00 8.92 14706301.05 ------------------------ &(&ctx->ctx_lock)->rlock 194753 [<ffffffff81158805>] aio_complete+0x45/0x260 &(&ctx->ctx_lock)->rlock 67287 [<ffffffff81159434>] do_io_submit+0x214/0x7b0 &(&ctx->ctx_lock)->rlock 56529 [<ffffffff81157da5>] __aio_get_req+0x95/0x1a0 &(&ctx->ctx_lock)->rlock 54461 [<ffffffff8115785d>] aio_put_req+0x2d/0x60 ------------------------ &(&ctx->ctx_lock)->rlock 205740 [<ffffffff81158805>] aio_complete+0x45/0x260 &(&ctx->ctx_lock)->rlock 125933 [<ffffffff8115785d>] aio_put_req+0x2d/0x60 &(&ctx->ctx_lock)->rlock 37042 [<ffffffff81157da5>] __aio_get_req+0x95/0x1a0 &(&ctx->ctx_lock)->rlock 29197 [<ffffffff81159434>] do_io_submit+0x214/0x7b0 ............................................................................................................................................................................................... [ ... ] Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html