On 18/05/17 11:21, Linus Walleij wrote:
> On Tue, May 16, 2017 at 1:54 PM, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>> On 10/05/17 11:24, Linus Walleij wrote:
>>> The mmc_queue_req is a per-request state container the MMC core uses
>>> to carry bounce buffers, pointers to asynchronous requests and so on.
>>> Currently it is allocated as a static array of objects; as a request
>>> comes in, an mmc_queue_req is assigned to it and used for the
>>> lifetime of the request.
>>>
>>> This is backwards compared to how other block layer drivers work:
>>> they usually let the block core provide a per-request struct that gets
>>> allocated right behind the struct request, and which can be obtained
>>> using the blk_mq_rq_to_pdu() helper. (The _mq_ infix in this function
>>> name is misleading: it is used by both the old and the MQ block
>>> layer.)
>>>
>>> The per-request struct gets allocated to the size stored in the queue
>>> variable .cmd_size, initialized using .init_rq_fn() and
>>> cleaned up using .exit_rq_fn().
>>>
>>> The block layer code makes the MMC core rely on this mechanism to
>>> allocate the per-request mmc_queue_req state container.
>>>
>>> Doing this makes a lot of complicated queue handling go away.
>>
>> Isn't that at the expense of increased memory allocation?
>>
>> Have you compared the number of allocations? It looks to me like the block
>> layer allocates a minimum of 4 requests in the memory pool, which will
>> increase if there are more in the I/O scheduler, plus 1 for flush. There
>> are often 4 queues per eMMC (2x boot, RPMB and main area), so that is 20
>> requests minimum, up from 2 allocations previously. For someone using 64K
>> bounce buffers, you have increased memory allocation by at least 18x64 =
>> 1152k. However the I/O scheduler could allocate a lot more.
>
> That is not a realistic example.
>
> As pointed out in patch #1, bounce buffers are used on old systems
> which have max_segs == 1. No modern hardware has that:
> they all have multiple-segment-capable host controllers and
> often also DMA engines.
>
> Old systems with max_segs == 1 also have:
>
> - One SD or MMC slot
> - No eMMC (because it was not yet invented in those times)
> - So no RPMB or boot partitions, just the main area
>
> If you can point me to a system that has max_segs == 1 and an
> eMMC mounted, I can look into it and ask the driver maintainers to
> check if it disturbs them, but I think those simply do not exist.
>
>>> Doing this refactoring is necessary to move the ioctl() operations
>>> into custom block layer requests tagged with REQ_OP_DRV_[IN|OUT]
>>
>> Obviously you could create a per-request data structure with only the
>> reference to the IOCTL data, and without putting all the memory allocations
>> there as well.
>
> Not easily, and this is the way all IDE, ATA, SCSI disks etc. are
> doing this, so why would we try to be different and maintain a lot
> of deviant code?
>
> The allocation of extra data is done by the block layer when issuing
> blk_get_request(), so trying to keep the old mechanism of a list of
> struct mmc_queue_req and trying to pair these with incoming requests
> inevitably means a lot of extra work, possibly deepening that list or
> creating out-of-list extra entries and whatnot.
>
> It's better to do what everyone else does and let the core do this
> allocation of extra data (tag) instead.

I agree it is much nicer, but the extra bounce buffer allocations still seem
gratuitous.
Maybe we should allocate them as needed from a memory pool, instead of up
front for every request.
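
For reference, my understanding of the per-request extra-data mechanism you
describe is roughly the following. This is only a sketch, not your actual
patch; the foo_ names and the bounce_buf member are made up for illustration:

#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/slab.h>

/* Extra per-request state the block layer allocates behind struct request */
struct foo_queue_req {
	char	*bounce_buf;	/* whatever per-request state the driver needs */
};

static int foo_init_request(struct request_queue *q, struct request *req,
			    gfp_t gfp)
{
	struct foo_queue_req *fqr = blk_mq_rq_to_pdu(req);

	fqr->bounce_buf = NULL;	/* set up per-request state */
	return 0;
}

static void foo_exit_request(struct request_queue *q, struct request *req)
{
	struct foo_queue_req *fqr = blk_mq_rq_to_pdu(req);

	kfree(fqr->bounce_buf);	/* tear down per-request state */
}

/* Wired up before blk_init_allocated_queue() sizes the request pool */
static void foo_setup_queue(struct request_queue *q)
{
	q->cmd_size = sizeof(struct foo_queue_req);
	q->init_rq_fn = foo_init_request;
	q->exit_rq_fn = foo_exit_request;
}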
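
What I have in mind for the bounce buffers is more along these lines
(untested sketch, names made up): keep a small pool per host with max_segs
== 1 and only take a full-size buffer when a request actually needs
bouncing, instead of embedding one in every pre-allocated request:

#include <linux/mempool.h>
#include <linux/slab.h>

#define FOO_BOUNCE_SZ		(64 * 1024)
#define FOO_BOUNCE_MIN_NR	2	/* guaranteed minimum, like the old static pair */

static mempool_t *foo_bounce_pool;

static int foo_bounce_pool_init(void)
{
	foo_bounce_pool = mempool_create_kmalloc_pool(FOO_BOUNCE_MIN_NR,
						      FOO_BOUNCE_SZ);
	return foo_bounce_pool ? 0 : -ENOMEM;
}

/* Called only when a request actually needs bouncing */
static char *foo_get_bounce_buf(gfp_t gfp)
{
	return mempool_alloc(foo_bounce_pool, gfp);
}

static void foo_put_bounce_buf(char *buf)
{
	mempool_free(buf, foo_bounce_pool);
}

A real version would probably want per-host pools and page allocations
rather than a 64K kmalloc, but this is just to illustrate the idea.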