On Wed, Aug 17 2016 at 8:29pm -0400, Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote: > On 08/16/2016 07:43 PM, Mike Snitzer wrote: > >On Tue, Aug 16 2016 at 3:12pm -0400, Mike Snitzer <snitzer@xxxxxxxxxx> wrote: > >>On Tue, Aug 16 2016 at 1:32pm -0400, Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote: > >>>Can you have a look at this? > >> > >>I'll get linux-dm.git's 'dm-4.8' branch (v4.8-rc2) loaded on one of my > >>testbed systems to run the mptest testsuite. It should provide coverage > >>for simple failover and failback. > > > >I successfully ran mptest against that v4.8-rc2 based kernel. Any > >chance you could try mptest to see if it also passes for you? Or if it > >is able to trigger the issue for you. > > > >mptest's runtest script defaults to setting up scsi-mq and dm-mq. > > Hello Mike, > > I will run mptest as soon as I have the time. OK, thanks. > Earlier today I came up with three patches that avoid the hang in > truncate_inode_pages_range() that I had reported before. It would be > appreciated if you could have a look at these patches. Please keep > in mind that I'm not a dm expert. I don't recall where you mentioned a hang in truncate_inode_pages_range().. ah I see it here: https://patchwork.kernel.org/patch/9202331/ Comments inlined below. > From 0e8e9f7d7489f5e3b0bf9e0c59257277d08a2ec0 Mon Sep 17 00:00:00 2001 > From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx> > Date: Wed, 17 Aug 2016 14:18:36 -0700 > Subject: [PATCH 1/3] block: Avoid that requests get stuck when stopping and > restarting a queue > > If a driver calls blk_start_queue() after a queue has been marked > "dead", avoid that requests get stuck by invoking __blk_run_queue() > directly if QUEUE_FLAG_DEAD has been set. > > While testing the dm-mpath driver I noticed that sporadically a > readahead request was generated by the filesystem on top of the dm > device but that that I/O request was never processed. This happened > only while the dm core was suspending and resuming I/O.s > > The following backtrace (triggered by a WARN_ON_ONCE(true) statement > that I had inserted just before the __blk_run_queue(q) call) shows > that this code path is relevant: > > WARNING: CPU: 7 PID: 4707 at drivers/md/dm.c:2035 map_request+0x1d5/0x340 [dm_mod] > Aug 17 15:47:35 ion-dev-ib-ini kernel: CPU: 7 PID: 4707 Comm: kdmwork-254:0 Tainted: G W 4.7.0-dbg+ #2 > Call Trace: > [<ffffffff81322ee7>] dump_stack+0x68/0xa1 > [<ffffffff81061e06>] __warn+0xc6/0xe0 > [<ffffffff81061ed8>] warn_slowpath_null+0x18/0x20 > [<ffffffffa04282d5>] map_request+0x1d5/0x340 [dm_mod] > [<ffffffffa042845d>] map_tio_request+0x1d/0x40 [dm_mod] > [<ffffffff81085fe1>] kthread_worker_fn+0xd1/0x1b0 > [<ffffffff81085e9a>] kthread+0xea/0x100 > [<ffffffff8162857f>] ret_from_fork+0x1f/0x40 > > References: commit 704605711ef0 ("block: Avoid scheduling delayed work on a dead queue") > Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx> > Cc: Christoph Hellwig <hch@xxxxxx> > Cc: Mike Snitzer <snitzer@xxxxxxxxxx> > Cc: stable <stable@xxxxxxxxxxxxxxx> > --- > block/blk-core.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/block/blk-core.c b/block/blk-core.c > index 0d9638e..5f75709 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -355,6 +355,8 @@ void blk_run_queue_async(struct request_queue *q) > { > if (likely(!blk_queue_stopped(q) && !blk_queue_dead(q))) > mod_delayed_work(kblockd_workqueue, &q->delay_work, 0); > + else > + __blk_run_queue(q); > } > EXPORT_SYMBOL(blk_run_queue_async); Your else clause would catch the case when the queue is stopped but not dead... __blk_run_queue() will handle this cleanly but I think the logic could be: else if (blk_queue_dead(q)) Also, strikes me as odd to have an _async interface resort to a synchronous call. Why is that needed? Are you sure it is safe for all callers? > From 9b103cfb2d9b4cdec73c7d09188b4d42b041d49e Mon Sep 17 00:00:00 2001 > From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx> > Date: Wed, 17 Aug 2016 16:26:02 -0700 > Subject: [PATCH 2/3] dm: Change return type of __dm_internal_suspend() to int > > --- > drivers/md/dm.c | 15 ++++++++++----- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index 3689b7f..6eccdf3 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -3379,16 +3379,18 @@ out: > * It may be used only from the kernel. > */ > > -static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_flags) > +static int __dm_internal_suspend(struct mapped_device *md, > + unsigned suspend_flags) > { > struct dm_table *map = NULL; > + int ret = 0; > > if (md->internal_suspend_count++) > - return; /* nested internal suspend */ > + goto out; /* nested internal suspend */ > > if (dm_suspended_md(md)) { > set_bit(DMF_SUSPENDED_INTERNALLY, &md->flags); > - return; /* nest suspend */ > + goto out; /* nest suspend */ > } > > map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock)); > @@ -3399,10 +3401,13 @@ static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_fla > * would require changing .presuspend to return an error -- avoid this > * until there is a need for more elaborate variants of internal suspend. > */ > - (void) __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE, > - DMF_SUSPENDED_INTERNALLY); > + ret = __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE, > + DMF_SUSPENDED_INTERNALLY); > > dm_table_postsuspend_targets(map); > + > +out: > + return ret; > } > > static void __dm_internal_resume(struct mapped_device *md) So you're only wanting to have your new __dm_internal_suspend() caller, __dm_destroy, issue a WARN_ON if __dm_suspend() fails. Is this _really_ important or did you just do this for debugging purposes? > From f6e58e71eec584b3f66fe8367340ca896d29408e Mon Sep 17 00:00:00 2001 > From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx> > Date: Wed, 17 Aug 2016 15:26:07 -0700 > Subject: [PATCH 3/3] dm: Avoid that requests get stuck while destroying a dm > device > > --- > drivers/md/dm.c | 25 ++++++++++++++----------- > 1 file changed, 14 insertions(+), 11 deletions(-) > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index 6eccdf3..ae4bfa0 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -2958,10 +2958,13 @@ const char *dm_device_name(struct mapped_device *md) > } > EXPORT_SYMBOL_GPL(dm_device_name); > > +static int __dm_internal_suspend(struct mapped_device *md, > + unsigned suspend_flags); > + > static void __dm_destroy(struct mapped_device *md, bool wait) > { > - struct dm_table *map; > - int srcu_idx; > + struct request_queue *q = dm_get_md_queue(md); > + int res; > > might_sleep(); > > @@ -2970,21 +2973,21 @@ static void __dm_destroy(struct mapped_device *md, bool wait) > set_bit(DMF_FREEING, &md->flags); > spin_unlock(&_minor_lock); > > - if (dm_request_based(md) && md->kworker_task) > - flush_kthread_worker(&md->kworker); Header needs detail. E.g. __dm_suspend() via, __dm_internal_suspend(), will call flush_kthread_worker(). But that is a nit in the grand scheme of things. Bigger concerns below. > + /* > + * Avoid that dm_make_request() gets called after > + * __dm_internal_suspend() finished. > + */ > + spin_lock_irq(q->queue_lock); > + queue_flag_set(QUEUE_FLAG_DYING, q); > + spin_unlock_irq(q->queue_lock); Feels like the case for marking the queue as DYING can be made independent of changes to the mechanism for flushing IO here. > > /* > * Take suspend_lock so that presuspend and postsuspend methods > * do not race with internal suspend. > */ > mutex_lock(&md->suspend_lock); > - map = dm_get_live_table(md, &srcu_idx); > - if (!dm_suspended_md(md)) { > - dm_table_presuspend_targets(map); > - dm_table_postsuspend_targets(map); > - } > - /* dm_put_live_table must be before msleep, otherwise deadlock is possible */ > - dm_put_live_table(md, srcu_idx); > + res = __dm_internal_suspend(md, DM_SUSPEND_NOFLUSH_FLAG); > + WARN_ONCE(res, "__dm_internal_suspend() returned %d\n", res); > mutex_unlock(&md->suspend_lock); > > /* Please be explicit about why switching to using __dm_internal_suspend() is important. Are you just going for code reuse or is there a more important reason? I can infer you probably want the call to dm_stop_queue()... But I need way more convincing on why __dm_internal_suspend() is needed. As you can see from the header for commit ffcc3936416, the internal suspend code is relatively complex. Increasing its use, especially in a method such as __dm_destroy, needs a good reason. But that aside, what raises serious concerns for me is that you're using DM_SUSPEND_NOFLUSH_FLAG where we absolutely _need_ to flush IO (given we're in __dm_destroy). At a minimum DM_SUSPEND_NOFLUSH_FLAG must not be used. Mike -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel