Kiyoshi Ueda wrote:
> This patch adds core functions for request-based dm.
>
> When struct mapped device (md) is initialized, md->queue has
> an I/O scheduler and the following functions are used for
> request-based dm as the queue functions:
>     make_request_fn: dm_make_request()
>     prep_rq_fn:      dm_prep_fn()
>     request_fn:      dm_request_fn()
>     softirq_done_fn: dm_softirq_done()
>     lld_busy_fn:     dm_lld_busy()
> Actual initializations are done in another patch (PATCH 3).
>
> Below is a brief summary of how request-based dm behaves, including:
>   - making request from bio
>   - cloning, mapping and dispatching request
>   - completing request and bio
>   - suspending md
>   - resuming md
>
>
> bio to request
> ==============
> md->queue->make_request_fn() (dm_make_request()) calls __make_request()
> for a bio submitted to the md.
> Then, the bio is kept in the queue as a new request or merged into
> another request in the queue if possible.
>
>
> Cloning and Mapping
> ===================
> Cloning and mapping are done in md->queue->request_fn() (dm_request_fn()),
> when requests are dispatched after they are sorted by the I/O scheduler.
>
> dm_request_fn() checks the busy state of underlying devices using
> the target's busy() function and, if they are busy, stops dispatching
> requests so that they stay on the dm device's queue.
> This helps I/O merging, since no merge is done for a request
> once it is dispatched to underlying devices.
>
> Actual cloning and mapping are done in dm_prep_fn() and map_request()
> called from dm_request_fn().
> dm_prep_fn() clones not only the request but also the bios of the request
> so that dm can hold bio completion in error cases and prevent
> the bio submitter from noticing the error.
> (See the "Completion" section below for details.)
>
> After the cloning, the clone is mapped by the target's map_rq() function
> and inserted into the underlying device's queue using
> blk_insert_cloned_request().
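
For readers following the dispatch logic described above, here is a minimal userspace sketch of the "stop dispatching while the target is busy" rule. It is only an illustration, not code from the patch: struct toy_request, struct toy_queue, struct toy_target, toy_busy() and toy_request_fn() are hypothetical stand-ins, and the in-flight limit is an assumed way of modelling what a target's busy() function might report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct toy_request {
	int id;
	struct toy_request *next;
};

struct toy_queue {
	struct toy_request *head;	/* requests still held on the dm queue */
};

struct toy_target {
	int in_flight;		/* requests already sent to the underlying device */
	int max_in_flight;	/* assumed limit at which the target reports busy */
};

/* Plays the role of the target's busy() function. */
static bool toy_busy(const struct toy_target *t)
{
	return t->in_flight >= t->max_in_flight;
}

/*
 * Plays the role of dm_request_fn(): dispatch from the head of the queue
 * until the queue is empty or the target reports busy. Requests left on
 * the queue remain candidates for merging. Returns the number dispatched.
 */
static int toy_request_fn(struct toy_queue *q, struct toy_target *t)
{
	int dispatched = 0;

	assert(q != NULL && t != NULL);
	while (q->head != NULL && !toy_busy(t)) {
		struct toy_request *rq = q->head;

		q->head = rq->next;	/* remove from the dm queue */
		rq->next = NULL;
		t->in_flight++;		/* cloning and mapping would happen here */
		dispatched++;
	}
	return dispatched;
}
```

With three queued requests and an assumed limit of two in flight, the first call dispatches two requests and leaves the third on the queue, where it could still be merged with later bios.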
>
>
> Completion
> ==========
> Request completion can be hooked by rq->end_io(), but then, all bios
> in the request will have been completed even in error cases, and the bio
> submitter will have noticed the error.
> To prevent the bio completion in error cases, request-based dm clones
> both bio and request and hooks both bio->bi_end_io() and rq->end_io():
>     bio->bi_end_io(): end_clone_bio()
>     rq->end_io():     end_clone_request()
>
> Summary of the request completion flow is below:
> blk_end_request() for a clone request
>   => __end_that_request_first()
>      => bio->bi_end_io() == end_clone_bio() for each clone bio
>         => Free the clone bio
>         => Success: Complete the original bio (blk_update_request())
>            Error:   Don't complete the original bio
>   => end_that_request_last()
>      => rq->end_io() == end_clone_request()
>         => blk_complete_request()
>            => dm_softirq_done()
>               => Free the clone request
>               => Success: Complete the original request (blk_end_request())
>                  Error:   Requeue the original request
>
> In successful cases, end_clone_bio() completes the original request
> by the size of the original bio.
> Even if all bios in the original request are completed by that
> completion, the original request must not be completed yet to keep
> the ordering of request completion for the stacking.
> So end_clone_bio() uses blk_update_request() instead of
> blk_end_request().
> In error cases, end_clone_bio() doesn't complete the original bio.
> It just frees the cloned bio and hands the error handling over to
> end_clone_request().
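
The two-stage completion described above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the patch's code: struct toy_orig_request, toy_end_clone_bio() and toy_end_clone_request() are hypothetical names, byte counting stands in for blk_update_request(), and the softirq hop through dm_softirq_done() is left out.

```c
#include <assert.h>

/* Hypothetical model of the original request's state; not kernel code. */
enum toy_outcome { TOY_PENDING, TOY_COMPLETED, TOY_REQUEUED };

struct toy_orig_request {
	unsigned int total_bytes;	/* size of the original request */
	unsigned int done_bytes;	/* bytes whose original bios have completed */
	enum toy_outcome outcome;	/* final fate of the original request */
};

/*
 * Plays the role of end_clone_bio(): on success, account the bio's bytes
 * against the original request without finishing the request itself.
 * On error, do nothing here; the decision is left to the request hook.
 */
static void toy_end_clone_bio(struct toy_orig_request *orig,
			      unsigned int bytes, int error)
{
	if (error)
		return;
	assert(orig->done_bytes + bytes <= orig->total_bytes);
	orig->done_bytes += bytes;
}

/*
 * Plays the role of end_clone_request(): only at this point is the
 * original request either completed or requeued for remapping.
 */
static void toy_end_clone_request(struct toy_orig_request *orig, int error)
{
	orig->outcome = error ? TOY_REQUEUED : TOY_COMPLETED;
}
```

The point of the split is visible in the model: even after every bio has been accounted for, the outcome stays TOY_PENDING until the request-level hook runs, and a failed clone bio never touches the original request's byte count.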
>
> end_clone_request(), which is called with the queue lock held, completes
> the clone request and the original request in a softirq context
> (dm_softirq_done()), which has no queue lock, to avoid a deadlock
> issue on submission of another request during the completion:
>   - The submitted request may be mapped to the same device
>   - Request submission requires the queue lock, but the queue lock
>     has been held by itself and it doesn't know that
>
> The clone request has no clone bio when dm_softirq_done() is called.
> So target drivers can't resubmit it, even in error cases.
> Instead, they can ask dm core to requeue and remap
> the original request in those cases.
>
>
> suspend
> =======
> Request-based dm uses stopping md->queue as suspend of the md.
> For noflush suspend, it just stops md->queue.
>
> For flush suspend, it inserts a marker request at the tail of md->queue
> and dispatches all requests in md->queue until the marker comes to
> the front of md->queue. Then, it stops dispatching requests and waits
> for all the dispatched requests to complete.
> After that, it completes the marker request, stops md->queue and
> wakes up the waiter on the suspend queue, md->wait.
>
>
> resume
> ======
> Starts md->queue.
>
>
> Signed-off-by: Kiyoshi Ueda <k-ueda@xxxxxxxxxxxxx>
> Signed-off-by: Jun'ichi Nomura <j-nomura@xxxxxxxxxxxxx>

Acked-by: Hannes Reinecke <hare@xxxxxxx>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
hare@xxxxxxx                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)