Re: Sleeping function called from invalid context

On Wed, Sep 27 2023 at 12:42P -0400,
Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Tue, Sep 26, 2023 at 11:53:33PM +0100, Matthew Wilcox wrote:
> > I'm going to sleep now instead of running the last 10 steps of the
> > bisect.  If nobody's found this when I wake up, I'll finish it then.
> 
> Bisection found it.  I confirmed by hand; checking out this commit
> yields a failed test, and then reverting it leads to a success.
> 
> commit 026e4728c276cdf3ec618a71a38181864596027b
> Author: Joe Thornber <ejt@xxxxxxxxxx>
> Date:   Wed Sep 13 10:39:09 2023 +0100
> 
>     dm thin: Use the extent allocator for data blocks
> 
>     The thin_pool object now contains an extent-allocator, and each thin
>     device contains an allocation-context from this.  The allocation
>     context is used to guide data block allocations.  The actual
>     allocation book-keeping is still done by the space-map.
> 
>     2 new specific unit tests were added to dm-unit:
> 
>        /thinp/fragmentation/thins
>        /thinp/fragmentation/snapshots
> 
>        https://github.com/jthornber/dm-unit/blob/main/src/tests/thinp.rs
> 
>     Signed-off-by: Joe Thornber <ejt@xxxxxxxxxx>
>     Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> 
> Joe, in case you missed the earlier splat:

...

> 
> This seems fairly clear to me; there's a spin_lock() around the call to
> __alloc() in dm_ea_context_alloc(), which then calls all the way down
> that stack until you get to cache_get(), which takes a semaphore and
> the locking validation quite reasonably says "You can't do that".
> 
> I'm sure you don't need my help coming up with a fix.  Although I might
> ask that you turn on at least some basic locking checks in future while
> developing your code, even if not full lockdep.  I think this particular
> warning comes out of CONFIG_DEBUG_ATOMIC_SLEEP=y.

Thanks for the report and for bisecting. I wish I had caught you
earlier to save you the hassle (it was immediately clear which commit
caused it when I saw the trace).

Joe is looking at how best to fix it, and is also updating
dmtest-python [1] to scrape the kernel log to pick up such bugs (as it
stands, the kernel carries on despite the splat, so far anyway).
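
For illustration only, here is a minimal sketch of what scraping the
kernel log for such splats could look like. It is not how dmtest-python
does (or will do) it; the function names find_splats() and check_log()
and the pattern list are assumptions made up for this example. The first
pattern is the header of the warning discussed in this thread; the other
two are common kernel locking messages included as examples.

```python
import re

# Assumed, non-exhaustive list of splat headers to look for.
SPLAT_PATTERNS = [
    re.compile(r"BUG: sleeping function called from invalid context"),
    re.compile(r"BUG: scheduling while atomic"),
    re.compile(r"WARNING: possible circular locking dependency detected"),
]


def find_splats(log_text):
    """Return (line_number, line) pairs for lines matching a known splat header."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if any(p.search(line) for p in SPLAT_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits


def check_log(log_text):
    """Raise RuntimeError if the captured kernel log contains a splat."""
    hits = find_splats(log_text)
    if hits:
        summary = "; ".join(f"line {n}: {text}" for n, text in hits)
        raise RuntimeError(f"kernel log contains splats: {summary}")
```

A test harness could capture the kernel log around each test (for
example the output of `dmesg`) and pass the text to check_log(), so a
test fails even when the kernel itself carries on after the warning.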

But I've dropped the related code from linux-next until this is all
fixed properly.

Mike

[1] https://github.com/jthornber/dmtest-python.git

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel



