Re: [PATCH] scsi, fcoe, libfc: drop scsi host_lock use from fc_queuecommand

On 2010-09-26 01:55, Bart Van Assche wrote:
> On Fri, Sep 24, 2010 at 8:41 AM, Jens Axboe <jaxboe@xxxxxxxxxxxx> wrote:
>>
>> [ ... ]
>>
>> Bart, can you try with this patchset added:
>>
>> git://git.kernel.dk/linux-2.6-block.git blk-alloc-optimize
>>
>> It's a work in progress and not suitable for general consumption yet,
>> but it's tested working at least. There will be more built on top of
>> this, but at least even this simple stuff is making a big difference
>> for IOPS testing for me.
> 
> Hello Jens,
> 
> Thanks for the feedback. I see a nice 10% speedup after having applied
> the four block layer optimization patches from the blk-alloc-optimize
> branch on an already patched 2.6.35.5 SRP initiator.

Great! Not too bad for something that's still a WIP.

> Note: according to the output of perf record -g, most spinlock calls
> still originate from the block layer. This is what the perf tool
> reported for a fio run using libaio with small blocks (512 bytes):
> 
> Event: cycles
> -      7.06%    fio  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
>    - _raw_spin_lock_irqsave
>       + 19.51% blk_run_queue
>       + 13.71% blk_end_bidi_request
>       + 10.04% mlx4_ib_poll_cq
>       + 4.68% lock_timer_base
>       + 4.22% aio_complete
>       + 3.97% srp_send_completion
>       + 3.71% srp_queuecommand
>       + 3.55% dio_bio_end_aio
>       + 3.37% __srp_get_tx_iu
>       + 3.14% srp_recv_completion
>       + 3.00% scsi_device_unbusy
>       + 2.87% __scsi_put_command
>       + 2.82% __blockdev_direct_IO_newtrunc
>       + 2.76% scsi_put_command
>       + 2.69% scsi_run_queue
>       + 2.65% dio_bio_submit
>       + 2.54% srp_remove_req
>       + 2.46% mlx4_ib_post_send
>       + 2.33% scsi_get_command
>       + 1.95% mlx4_ib_post_recv

One piece of low hanging fruit is reducing the number of queue runs.
SCSI does this for every completed command to keep the device queue
full. I bet if you try an experiment where you only run the queue when
a certain number of requests have completed, you would greatly reduce
scsi_run_queue and blk_run_queue in the above profile.
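
Something along these lines, as a rough illustration only. It's completely
untested, and the ->completed_batch field and the batch size are invented
for the example, they don't exist in the tree:

#include <linux/blkdev.h>
#include <scsi/scsi_device.h>

/*
 * Sketch of the batching idea: only kick the queue once every
 * SCSI_RUN_QUEUE_BATCH completed commands instead of on every single
 * completion, so the queue lock is taken far less often. The batch
 * size is arbitrary and starvation (commands left waiting because the
 * threshold is never reached) is not handled here.
 */
#define SCSI_RUN_QUEUE_BATCH	8

static void scsi_run_queue_batched(struct scsi_device *sdev)
{
	struct request_queue *q = sdev->request_queue;

	/* Not enough completions accumulated yet, skip the queue run. */
	if (++sdev->completed_batch < SCSI_RUN_QUEUE_BATCH)
		return;

	sdev->completed_batch = 0;
	blk_run_queue(q);
}

That would get called from the completion path in place of the
unconditional run, and it would need some way to flush the remainder
(run the queue when the device goes idle, or on a timer) so that low
queue depths don't stall.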

-- 
Jens Axboe
