2011/10/7 Jeff Moyer <jmoyer@xxxxxxxxxx>: > Christophe Saout <christophe@xxxxxxxx> writes: > >> Hi Jeff, >> >>> Anyway, it would help a great deal if you could retrigger the failure >>> and provide the full failure output. You can get that by issuing the >>> 'dmesg' command and redirecting it to a file. >> >> Oh, sorry, yes, there's a line missing. >> >> Line 323 is this one: BUG_ON(!rq->bio || rq->bio != rq->biotail); > > OK, it turns out my testing was incomplete. I only tested targets that > had a write-through cache, so I didn't hit this problem. It reproduces > pretty easily with just multipath involved (no linear target on top) when > running against the right storage. > > So, here's a patch, but I don't have a full explanation for it just yet. > What I observed was that, on fsync, blkdev_issue_flush was called. > Eventually, the flush request gets cloned, and blk_insert_cloned_request > is called. This cloned request never actually gets issued to the > q->requst_fn (scsi_request_fn in my case). So, it may be that there is > no plug list for this, so the queue isn't goosed? I'll try to come up > with a better explanation, or Tejun may just know off the top of his > head what's going on. > > So, the patch works for me, but is very much just an RFC. > > Cheers, > Jeff > > Signed-off-by: Jeff Moyer <jmoyer@xxxxxxxxxx> > > diff --git a/block/blk-flush.c b/block/blk-flush.c > index 491eb30..7aa4736 100644 > --- a/block/blk-flush.c > +++ b/block/blk-flush.c > @@ -320,7 +320,7 @@ void blk_insert_flush(struct request *rq) > return; > } > > - BUG_ON(!rq->bio || rq->bio != rq->biotail); > + BUG_ON(rq->bio && rq->bio != rq->biotail); > > /* > * If there's data but flush is not necessary, the request can be > @@ -345,6 +345,12 @@ void blk_insert_flush(struct request *rq) > rq->end_io = flush_data_end_io; > > blk_flush_complete_seq(rq, REQ_FSEQ_ACTIONS & ~policy, 0); > + > + /* > + * A cloned empty flush needs a queue kick to make progress. > + */ > + if (!rq->bio) > + blk_run_queue_async(q); > } Looks the dm request based flush logic is broken. saved_make_request_fn __make_request blk_insert_flush but blk_insert_flush doesn't put the original request to list, instead, the q->flush_rq is in list. then dm_request_fn blk_peek_request dm_prep_fn clone_rq map_request blk_insert_cloned_request so q->flush_rq is cloned, and get dispatched. but we can't clone q->flush_rq and use it to do flush. map_request even could assign a different blockdev to the cloned request. Clone q->flush_rq is absolutely wrong. we can clear ogirinal request's flush bit and set cloned request flush bit, so we can skip the blk_insert_flush logic for original request. can you please try this patch. Only compilation tested. Thanks, Shaohua
Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx> diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 52b39f3..07a930d 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1424,6 +1424,13 @@ static int dm_make_request(struct request_queue *q, struct bio *bio) { struct mapped_device *md = q->queuedata; + if (bio->bi_rw & (REQ_FLUSH | REQ_FUA)) { + if (bio->bi_rw & REQ_FLUSH) + bio->bi_rw |= REQ_CLONED_FLUSH; + if (bio->bi_rw & REQ_FUA) + bio->bi_rw |= REQ_CLONED_FUA; + bio->bi_rw &= ~(REQ_FLUSH|REQ_FUA); + } return md->saved_make_request_fn(q, bio); /* call __make_request() */ } @@ -1547,6 +1554,14 @@ static int dm_prep_fn(struct request_queue *q, struct request *rq) if (!clone) return BLKPREP_DEFER; + if (rq->cmd_flags & REQ_CLONED_FLUSH) + clone->cmd_flags |= REQ_FLUSH; + if (rq->cmd_flags & REQ_CLONED_FUA) + clone->cmd_flags |= REQ_FUA; + + rq->cmd_flags &= ~(REQ_FLUSH|REQ_FUA); + clone->cmd_flags &= ~(REQ_FLUSH|REQ_FUA); + rq->special = clone; rq->cmd_flags |= REQ_DONTPREP; diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 71fc53b..6ef5dc3 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -153,6 +153,8 @@ enum rq_flag_bits { __REQ_FLUSH_SEQ, /* request for flush sequence */ __REQ_IO_STAT, /* account I/O stat */ __REQ_MIXED_MERGE, /* merge of different types, fail separately */ + __REQ_CLONED_FLUSH, + __REQ_CLONED_FUA, __REQ_NR_BITS, /* stops here */ }; @@ -170,7 +172,7 @@ enum rq_flag_bits { (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER) #define REQ_COMMON_MASK \ (REQ_WRITE | REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | \ - REQ_DISCARD | REQ_NOIDLE | REQ_FLUSH | REQ_FUA | REQ_SECURE) + REQ_DISCARD | REQ_NOIDLE | REQ_FLUSH | REQ_FUA | REQ_SECURE | REQ_CLONED_FLUSH | REQ_CLONED_FUA) #define REQ_CLONE_MASK REQ_COMMON_MASK #define REQ_RAHEAD (1 << __REQ_RAHEAD) @@ -194,5 +196,7 @@ enum rq_flag_bits { #define REQ_IO_STAT (1 << __REQ_IO_STAT) #define REQ_MIXED_MERGE (1 << __REQ_MIXED_MERGE) #define REQ_SECURE (1 << __REQ_SECURE) +#define REQ_CLONED_FLUSH (1 << __REQ_CLONED_FLUSH) +#define REQ_CLONED_FUA (1 << __REQ_CLONED_FUA) #endif /* __LINUX_BLK_TYPES_H */
-- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel