On Mon, Jul 04, 2016 at 06:47:29PM +0800, Ming Lei wrote: > >> One clean solution may be to convert the loop of generic_make_request() > >> into the following way: > >> > >> do { > >> struct bio *splitted, *remainder = NULL; > >> struct request_queue *q = bdev_get_queue(bio->bi_bdev); > >> > >> blk_queue_split(q, &bio, &remainder, q->bio_split); > >> > >> ret = q->make_request_fn(q, bio); > >> > >> if (remainder) > >> bio_list_add(current->bio_list, remainder); > >> bio = bio_list_pop(current->bio_list); > >> } while (bio) > > > > Not good enough. > > Goal was to first process all "deeper level" bios > > before processing the remainder. > > For the reported bio splitting issue, I think the goal is to make sure all > BIOs generated from 'bio' inside .make_request_fn(bio) are queued > before the 'current' remainder. Cause it is the issue introduced by > blk_split_bio(). Stacking: qA (limitting layer) -> qB (remapping) -> qC (remapping) -> qD (hardware) [in fact you don't even need four layers, this is just to clarify that the stack may be more complex than you assume it would be] Columns below: 1) argument to generic_make_request() and its target queue. 2) current->bio_list 3) "in-flight" counter of qA. ==== In your new picture, iiuc: generic_make_request(bio_orig) NULL in-flight=0 bio_orig empty blk_queue_split() result: bio_s, bio_r qA->make_request_fn(bio_s) in-flight=1 bio_c = bio_clone(bio_s) generic_make_request(bio_c to qB) bio_c <-return bio_list_add(bio_r) bio_c, bio_r bio_list_pop() bio_r qB->make_request_fn(bio_c) (Assume it does not clone, but only remap. But it may also be a striping layer, and queue more than one bio here.) generic_make_request(bio_c to qC) bio_r, bio_c <-return bio_list_pop() bio_c qA->make_request_fn(bio_r) in-flight still 1 BLOCKS, because bio_c has not been processed to its final destination qD yet, and not dispatched to hardware. ==== my suggestion generic_make_request(bio_orig) NULL in-flight=0 bio_orig empty in-flight=0 qA->make_request_fn(bio_orig) blk_queue_split() result: bio_s, and bio_r stuffed away to head of remainder list. in-flight=1 bio_c = bio_clone(bio_s) generic_make_request(bio_c to qB) bio_c <-return bio_c bio_list_pop() empty qB->make_request_fn(bio_c) (Assume it does not clone, but only remap. But it may also be a striping layer, and queue more than one bio here.) generic_make_request(bio_c to qC) bio_c <-return bio_list_pop() empty qC->make_request_fn(bio_c) generic_make_request(bio_c to qD) bio_c <-return bio_list_pop() empty qD->make_request_fn(bio_c) dispatches to hardware <-return empty bio_list_pop() NULL, great, lets pop from remainder list qA->make_request_fn(bio_r) in-flight=? May block, but only until completion of bio_c. Which may already have happened. *makes progress* Thanks, Lars -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html