On Mon, Dec 14 2015 at 3:41P -0500,
Nikolay Borisov <kernel@xxxxxxxx> wrote:

> Had another poke at the backtrace that is produced and here what the
> delayed_work looks like:
>
> crash> struct delayed_work ffff88036772c8c0
> struct delayed_work {
>   work = {
>     data = {
>       counter = 1537
>     },
>     entry = {
>       next = 0xffff88036772c8c8,
>       prev = 0xffff88036772c8c8
>     },
>     func = 0xffffffffa0211a30 <do_waker>
>   },
>   timer = {
>     entry = {
>       next = 0x0,
>       prev = 0xdead000000200200
>     },
>     expires = 4349463655,
>     base = 0xffff88047fd2d602,
>     function = 0xffffffff8106da40 <delayed_work_timer_fn>,
>     data = 18446612146934696128,
>     slack = -1,
>     start_pid = -1,
>     start_site = 0x0,
>     start_comm =
> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
>   },
>   wq = 0xffff88030cf65400,
>   cpu = 21
> }
>
> From this it seems that the timer is also cancelled/expired judging by
> the values in timer -> entry. But then again in dm-thin the pool is
> first suspended, which implies the following functions were called:
>
> cancel_delayed_work(&pool->waker);
> cancel_delayed_work(&pool->no_space_timeout);
> flush_workqueue(pool->wq);
>
> so at that point dm-thin's workqueue should be empty and it shouldn't be
> possible to queue any more delayed work. But the crashdump clearly shows
> that the opposite is happening. So far all of this points to a race
> condition and inserting some sleeps after umount and after vgchange -Kan
> (command to disable volume group and suspend, so the cancel_delayed_work
> is invoked) seems to reduce the frequency of crashes, though it doesn't
> eliminate them.

'vgchange -Kan' doesn't suspend the pool before it destroys the device.
So the cancel_delayed_work()s you referenced aren't applicable.

Can you try this patch?

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 63903a5..b201d887 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -2750,8 +2750,11 @@ static void __pool_destroy(struct pool *pool)
 	dm_bio_prison_destroy(pool->prison);
 	dm_kcopyd_client_destroy(pool->copier);
 
-	if (pool->wq)
+	if (pool->wq) {
+		cancel_delayed_work(&pool->waker);
+		cancel_delayed_work(&pool->no_space_timeout);
 		destroy_workqueue(pool->wq);
+	}
 
 	if (pool->next_mapping)
 		mempool_free(pool->next_mapping, pool->mapping_pool);

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel