On Sun, 2009-05-03 at 15:20 -0400, Jeff Garzik wrote:
> [tangent...]
>
> Does make you wonder if a ->init_rq_fn() would be helpful, one that
> could perform gfp_t allocations rather than GFP_ATOMIC?  The idea being
> to call ->init_rq_fn() almost immediately after creation of struct
> request, by the struct request creator.

Isn't that what the current prep_fn actually is?

> I obviously have not thought in depth about this, but it does seem that
> init_rq_fn(), called earlier in struct request lifetime, could eliminate
> the need for ->prepare_flush, ->prepare_discard, and perhaps could be a
> better place for some of the ->prep_rq_fn logic.

It's hard to see how ... prep_rq_fn is already called pretty early ...
almost as soon as the elevator has decided to spit out the request.

> The creator of struct request generally has more freedom to sleep, and
> it seems logical to give low-level drivers a "fill in LLD-specific info"
> hook BEFORE the request is ever added to a request_queue.

Unfortunately, it's not really possible to find a sleeping context in
there: the elevators have to operate from the current elv_next_request()
context, which, in most drivers, can be either user or interrupt.

The way the block layer is designed is to pull allocations up the stack,
much closer to the process (usually at the bio creation point), because
that allows the elevators to operate even in memory-starved conditions.
If we pushed the allocation down to the request level, we'd need some
type of threading (bad for performance), and request processing would
stall whenever some GFP_KERNEL allocation went out to lunch looking for
memory.

The ideal for REQ_TYPE_DISCARD seems to be to force a page allocation
tied to a bio when it's issued at the top.  That way everyone has enough
memory when it comes down the stack (both the UNMAP extents and the
WRITE SAME sector will fit into a page ... although only just for WRITE
SAME with 4k sectors).

James
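
[Editor's note: a rough sketch of the "allocate at the top" idea James
describes.  It pre-allocates the discard payload page with GFP_KERNEL at
bio creation time, so nothing further down the stack has to allocate in
atomic context.  The helper names and the use of bio->bi_private to
carry the page are assumptions for illustration only, not what mainline
eventually did; bio_alloc(), alloc_page(), submit_bio() and the
DISCARD_BARRIER flag are the real circa-2.6.30 block layer interfaces.]

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/fs.h>
#include <linux/gfp.h>

static void sketch_discard_end_io(struct bio *bio, int err)
{
	/* Free the payload page we attached at issue time. */
	if (bio->bi_private)
		__free_page(bio->bi_private);
	bio_put(bio);
}

static int sketch_issue_discard(struct block_device *bdev,
				sector_t sector, sector_t nr_sects)
{
	struct bio *bio;
	struct page *page;

	/* Process context here, so a sleeping allocation is fine,
	 * unlike in prep_rq_fn.  Leave room for one payload segment. */
	bio = bio_alloc(GFP_KERNEL, 1);
	if (!bio)
		return -ENOMEM;

	/* Pre-allocate the page the LLD will fill in later; per the
	 * mail, both the extent list and the WRITE SAME sector fit. */
	page = alloc_page(GFP_KERNEL);
	if (!page) {
		bio_put(bio);
		return -ENOMEM;
	}

	bio->bi_sector = sector;
	bio->bi_bdev = bdev;
	bio->bi_size = nr_sects << 9;
	bio->bi_end_io = sketch_discard_end_io;
	bio->bi_private = page;	/* assumption: stash payload page here */

	submit_bio(DISCARD_BARRIER, bio);
	return 0;
}

With the page already attached, the driver's prep routine can build its
UNMAP extent list or WRITE SAME sector in place, with no GFP_ATOMIC
allocation anywhere on the discard path.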