Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek

Amir Goldstein <amir73il@xxxxxxxxx> · Fri, 5 Apr 2024 13:34:11 +0300

On Fri, Apr 5, 2024 at 1:01 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
>
> > This specifically cannot happen because sysfs is not allowed as an
> > upper layer, only as a lower layer, so overlayfs itself will not be writing to
> > /sys/power/resume.
>
> Then how could you possibly get a deadlock there?  What would your minimal
> deadlocked set look like?
>
> 1.  Something is blocked in lookup_bdev() called from resume_store(), called
> from sysfs_kf_write(), called from kernfs_write_iter(), which has acquired
> ->mutex of struct kernfs_open_file that had been allocated by
> kernfs_fop_open() back when the file had been opened.  Note that each
> struct file instance gets a separate struct kernfs_open_file.  Since we are
> calling ->write_iter(), the file *MUST* have been opened for write.
>
> 2.  Something is blocked in kernfs_fop_llseek() on the same of->mutex,
> i.e. using the same struct file as (1).  That something is holding an
> overlayfs inode lock, which is what the next thread is blocked on.
>
> + at least one more thread, to complete the cycle.
>
> Right?  How could that possibly happen without overlayfs opening /sys/power/resume
> for write?  Again, each struct file instance gets a separate of->mutex;
> for a deadlock you need a cycle of threads and a cycle of locks, such
> that each thread is holding the corresponding lock and is blocked on
> attempt to get the lock that comes next in the cyclic order.

Absolutely right.
I had it in my mind that this was a per-node lock, not a per-open-file one.
I did not look closely.
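
For reference, the argument can be sketched as a toy wait-for graph in plain C.
Nothing below is kernel code: struct waiter, has_cycle() and the lock ids are
made up for illustration, and the real chain in the report involves more locks
than the two threads modelled here. Tracked per instance, nobody holds the r/o
open's of->mutex, so there is no cycle; tracked per class, as lockdep does,
both of->mutex instances collapse into one node and a cycle appears.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each blocked thread holds one lock and wants one. */
struct waiter {
	int holds;	/* id of the lock this thread holds */
	int wants;	/* id of the lock this thread is blocked on */
};

/* Index of the thread holding @lock, or -1 if nobody holds it. */
static int holder_of(const struct waiter *t, size_t n, int lock)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].holds == lock)
			return (int)i;
	return -1;
}

/* True if following "blocked on -> holder" edges leads back to the start. */
static bool has_cycle(const struct waiter *t, size_t n)
{
	for (size_t start = 0; start < n; start++) {
		int cur = (int)start;

		for (size_t steps = 0; steps < n; steps++) {
			cur = holder_of(t, n, t[cur].wants);
			if (cur < 0)
				break;		/* wanted lock is free */
			if ((size_t)cur == start)
				return true;	/* closed the cycle */
		}
	}
	return false;
}

/* Made-up lock ids for the simplified two-thread scenario. */
enum { OF_MUTEX_RW = 1, OF_MUTEX_RO = 2, OVL_INODE = 3, OF_MUTEX_CLASS = 1 };

/* Per instance: the writer holds its own of->mutex, llseek wants another. */
static const struct waiter by_instance[] = {
	{ .holds = OF_MUTEX_RW, .wants = OVL_INODE },	/* write -> lookup */
	{ .holds = OVL_INODE,   .wants = OF_MUTEX_RO },	/* ovl llseek */
};

/* Per class: both of->mutex instances share a single id. */
static const struct waiter by_class[] = {
	{ .holds = OF_MUTEX_CLASS, .wants = OVL_INODE },
	{ .holds = OVL_INODE,      .wants = OF_MUTEX_CLASS },
};
```

In this model has_cycle(by_instance, 2) is false while has_cycle(by_class, 2)
is true, which is the shape of a class-level false positive.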

>
> If overlayfs never writes to that sucker, it can't participate in that
> cycle.  Sure, you can get overlayfs llseek grabbing of->mutex of *ANOTHER*
> struct file opened for the same sysfs file.  Since it's not the same
> struct file and since each struct file there gets a separate kernfs_open_file
> instance, the mutex won't be the same.
>
> Unless I'm missing something else, that can't deadlock.  For a quick and
> dirty experiment, try to give of->mutex on r/o opens a class separate from
> that on r/w and w/o opens (mutex_init() in kernfs_fop_open()) and see
> if lockdep warnings persist.
>
> Something like
>
>         if (has_mmap)
>                 mutex_init(&of->mutex);
>         else if (file->f_mode & FMODE_WRITE)
>                 mutex_init(&of->mutex);
>         else
>                 mutex_init(&of->mutex);

Why a quick experiment?
Why not a permanent kludge?

It is no better or worse than the already existing has_mmap
lockdep class annotation, is it?
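
For anyone wondering why three identical-looking calls make a difference: the
kernel's mutex_init() is a macro that declares a static struct lock_class_key
at each expansion, so every call site gets its own lockdep class. A userspace
sketch of that mechanism follows. The types and helpers in it (struct mutex,
struct lock_class_key, mutex_init_key(), init_of_mutex(),
classes_are_distinct()) are simplified stand-ins or hypothetical names, not the
kernel definitions.

```c
#include <stdbool.h>

/* Stand-in types; the real kernel structures carry much more state. */
struct lock_class_key { char dummy; };
struct mutex { const struct lock_class_key *key; };

/* Hypothetical helper: record which class key the mutex was given. */
static void mutex_init_key(struct mutex *m, const struct lock_class_key *key)
{
	m->key = key;
}

/* Like the kernel macro: each expansion declares its own static key. */
#define mutex_init(m) do {				\
	static struct lock_class_key site_key;		\
	mutex_init_key((m), &site_key);			\
} while (0)

#define FMODE_WRITE 0x2

/* Three call sites, hence three distinct keys, despite identical text. */
static void init_of_mutex(struct mutex *m, bool has_mmap, unsigned int f_mode)
{
	if (has_mmap)
		mutex_init(m);
	else if (f_mode & FMODE_WRITE)
		mutex_init(m);
	else
		mutex_init(m);
}

/* True when mmap, writable and read-only opens land in separate classes. */
static bool classes_are_distinct(void)
{
	struct mutex mm, rw, ro;

	init_of_mutex(&mm, true, 0);
	init_of_mutex(&rw, false, FMODE_WRITE);
	init_of_mutex(&ro, false, 0);
	return mm.key != rw.key && rw.key != ro.key && mm.key != ro.key;
}
```

Two read-only opens still share one key in this sketch, so ordering problems
among files of the same kind would still be caught.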

Thanks,
Amir.