On Wed, Sep 27 2023 at 12:42P -0400,
Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Tue, Sep 26, 2023 at 11:53:33PM +0100, Matthew Wilcox wrote:
> > I'm going to sleep now instead of running the last 10 steps of the
> > bisect.  If nobody's found this when I wake up, I'll finish it then.
>
> Bisection found it.  I confirmed by hand; checking out this commit
> yields a failed test, and then reverting it leads to a success.
>
> commit 026e4728c276cdf3ec618a71a38181864596027b
> Author: Joe Thornber <ejt@xxxxxxxxxx>
> Date:   Wed Sep 13 10:39:09 2023 +0100
>
>     dm thin: Use the extent allocator for data blocks
>
>     The thin_pool object now contains an extent-allocator, and each thin
>     device contains an allocation-context from this.  The allocation
>     context is used to guide data block allocations.  The actual
>     allocation book-keeping is still done by the space-map.
>
>     2 new specific unit tests were added to dm-unit:
>
>       /thinp/fragmentation/thins
>       /thinp/fragmentation/snapshots
>
>     https://github.com/jthornber/dm-unit/blob/main/src/tests/thinp.rs
>
>     Signed-off-by: Joe Thornber <ejt@xxxxxxxxxx>
>     Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
>
> Joe, in case you missed the earlier splat:
...
>
> This seems fairly clear to me; there's a spin_lock() around the call to
> __alloc() in dm_ea_context_alloc(), which then calls all the way down
> that stack until you get to cache_get(), which takes a semaphore and
> the locking validation quite reasonably says "You can't do that".
>
> I'm sure you don't need my help coming up with a fix.  Although I might
> ask that you turn on at least some basic locking checks in future while
> developing your code, even if not full lockdep.  I think this particular
> warning comes out of CONFIG_DEBUG_ATOMIC_SLEEP=y.

Thanks for the report and bisecting -- wish I'd caught you earlier to
save you the hassle (it was immediately clear which commit caused it
when I saw the trace).

Joe is looking at how best to fix this and is also updating
dmtest-python [1] to scrape the kernel log to pick up such bugs (as-is,
the kernel will carry on despite the splat, so far anyway).

But I've dropped the related code from linux-next until this is all
fixed properly.

Mike

[1] https://github.com/jthornber/dmtest-python.git
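
To illustrate the log-scraping idea mentioned above, here is a minimal sketch in Python. It is not dmtest-python's actual implementation; the function name find_splats and the pattern list are assumptions made for illustration. The patterns cover the headers the kernel prints for a sleep-in-atomic report (the CONFIG_DEBUG_ATOMIC_SLEEP case), for "scheduling while atomic", and for a generic WARNING; a real harness would likely want a longer list.

```python
import re

# Assumed, non-exhaustive list of kernel "splat" headers worth failing a test on.
SPLAT_PATTERNS = [
    # Printed when a sleeping function is called in atomic context
    # (the check enabled by CONFIG_DEBUG_ATOMIC_SLEEP).
    re.compile(r"BUG: sleeping function called from invalid context"),
    re.compile(r"BUG: scheduling while atomic"),
    # Generic WARN()/WARN_ON() header.
    re.compile(r"WARNING: CPU: \d+ PID: \d+"),
]


def find_splats(log_text):
    """Return (line_number, line) pairs for log lines that look like a splat.

    Hypothetical helper: line numbers are 1-based positions within log_text.
    """
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SPLAT_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits
```

A test runner could capture the kernel log before and after each test (for example from the output of dmesg), pass the new portion to find_splats(), and mark the test as failed if the returned list is non-empty, since the kernel itself carries on after the splat.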