RE: How to make kernel block layer generate bigger request in the request queue?

"Gao, Yunpeng" <yunpeng.gao@xxxxxxxxx> · Mon, 19 Apr 2010 14:42:38 +0800

Thanks a lot to Alan for this suggestion. I think it makes sense to simulate scatter/gather in the driver for this case. I'll try it later and expect to see improved performance.

>-----Original Message-----
>From: Alan Cox [mailto:alan@xxxxxxxxxxxxxxxxxxx]
>Sent: April 13, 2010 23:21
>To: Gao, Yunpeng
>Cc: James Bottomley; Martin K. Petersen; Robert Hancock;
>linux-ide@xxxxxxxxxxxxxxx; linux-mmc@xxxxxxxxxxxxxxx
>Subject: Re: How to make kernel block layer generate bigger request in the
>request queue?
>
>> And I'm just curious why the block layer does not merge these contiguous
>> sectors into one single request? For example, if the block layer generated
>> 'start_sect: 48776, nsect: 64, rw: r' instead of the requests below, I think
>> the performance would be better.
>
>You said earlier "My hardware doesn't support scatter/gather"
>
>> start_sect: 48776, nsect: 8, rw: r
>> start_sect: 48784, nsect: 8, rw: r
>> start_sect: 48792, nsect: 8, rw: r
>> start_sect: 48800, nsect: 8, rw: r
>> start_sect: 48808, nsect: 8, rw: r
>> start_sect: 48816, nsect: 8, rw: r
>> start_sect: 48824, nsect: 8, rw: r
>> start_sect: 48832, nsect: 8, rw: r
>
>Print the bus address of each request and you will probably find they are
>not contiguous so they have not been merged because your hardware could
>not do that transfer and you have no IOMMU.
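
The check suggested above can be modelled in plain userspace C. This is only a sketch: `struct seg` and `count_contiguous_runs` are hypothetical names, not kernel APIs, and the bus addresses would have to come from whatever the driver prints for each request. Sectors that are adjacent on the device can still sit in scattered memory pages, and a device without scatter/gather needs one command per contiguous run of memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one request's data buffer:
 * the bus address it was mapped at and its length in bytes. */
struct seg {
    uint64_t bus_addr;
    uint32_t len;
};

/* Count the runs of buffers that are contiguous in bus address space.
 * A new run starts whenever a buffer does not begin exactly where the
 * previous one ended. */
static size_t count_contiguous_runs(const struct seg *segs, size_t n)
{
    size_t runs = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == 0 ||
            segs[i - 1].bus_addr + segs[i - 1].len != segs[i].bus_addr)
            runs++;
    }
    return runs;
}
```

If the eight 4K requests listed above came back as eight separate runs, that would be consistent with the explanation that they could not be merged.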
>
>If the overhead per command is really really huge you can preallocate an
>internal buffer of say 32K or 64K in your driver and tell the block layer
>you do scatter gather, then copy the buffers into a linear chunk. I'd be
>very surprised if that was a win overall on any vaguely sane hardware but
>flash with erase block overhead and the like might be one of the less
>sane cases.
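
A minimal userspace sketch of the bounce-buffer idea described above, under stated assumptions: `struct sg_seg`, `gather_to_bounce` and `scatter_from_bounce` are hypothetical names for illustration, not kernel functions, and a real driver would work on the kernel's own scatterlist structures instead. Writes gather the scattered buffers into one linear chunk before a single device command; reads do one device command into the linear chunk and then scatter it back out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical scatter/gather element: one memory buffer and its length. */
struct sg_seg {
    void  *buf;
    size_t len;
};

/* Write path: copy every segment into the linear bounce buffer.
 * Returns the number of bytes gathered, or 0 if the segments do not
 * fit in 'cap' bytes (an empty list also returns 0). */
static size_t gather_to_bounce(unsigned char *bounce, size_t cap,
                               const struct sg_seg *segs, size_t n)
{
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        if (segs[i].len > cap - off)
            return 0;               /* would overflow the bounce buffer */
        memcpy(bounce + off, segs[i].buf, segs[i].len);
        off += segs[i].len;
    }
    return off;
}

/* Read path: after the device has filled the bounce buffer with 'avail'
 * bytes, copy them back out to the segments. Returns the number of bytes
 * scattered, or 0 if the segments ask for more than 'avail'. */
static size_t scatter_from_bounce(const unsigned char *bounce, size_t avail,
                                  const struct sg_seg *segs, size_t n)
{
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        if (segs[i].len > avail - off)
            return 0;               /* not enough data in the bounce buffer */
        memcpy(segs[i].buf, bounce + off, segs[i].len);
        off += segs[i].len;
    }
    return off;
}
```

The extra copy is the cost being traded against per-command overhead, which is why it may only pay off on hardware where each command is very expensive.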
>
>Alan