Re: [syzbot] possible deadlock in p9_write_work

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Thu, 31 Mar 2022 08:43:32 +0900

Hello.

Since commit "ext4: truncate during setxattr leads to kernel panic" did not
choose a per-superblock WQ, ext4_put_super() for one ext4 superblock currently
waits for completion of every iput() deferred via ext4_xattr_set_entry() ->
delayed_iput() -> delayed_iput_fn() on all ext4 superblocks (in addition to
other works scheduled by unrelated subsystems).

If ext4_put_super() for some superblock wants to wait only for works from that
superblock, please use a per-superblock WQ. Creating a per-superblock WQ via
alloc_workqueue() without the WQ_MEM_RECLAIM flag will not consume many resources.

If ext4_put_super() for some superblock can afford to wait for iput() from
other ext4 superblocks, you can use a per-filesystem WQ.
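
To illustrate the difference in flush scope, here is a minimal userspace toy
model. It is not kernel code, it runs single-threaded, and every toy_* name is
hypothetical. In ext4 the corresponding steps would presumably go through
alloc_workqueue(), queue_work(), flush_workqueue() and destroy_workqueue().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_WQ_MAX 8

/* Hypothetical userspace stand-in for one kernel workqueue. */
typedef void (*toy_work_fn)(void *arg);

struct toy_wq {
	toy_work_fn fn[TOY_WQ_MAX];
	void *arg[TOY_WQ_MAX];
	size_t pending;
};

/* Analogue of queue_work(): record the work item on this queue only. */
static void toy_queue_work(struct toy_wq *wq, toy_work_fn fn, void *arg)
{
	assert(wq->pending < TOY_WQ_MAX);
	wq->fn[wq->pending] = fn;
	wq->arg[wq->pending] = arg;
	wq->pending++;
}

/* Analogue of flush_workqueue(): complete items on this queue, no other. */
static void toy_flush_workqueue(struct toy_wq *wq)
{
	for (size_t i = 0; i < wq->pending; i++)
		wq->fn[i](wq->arg[i]);
	wq->pending = 0;
}

/* Stand-in for delayed_iput_fn(): count one completed iput(). */
static void toy_delayed_iput(void *arg)
{
	int *iputs_done = arg;

	(*iputs_done)++;
}

struct toy_result {
	int iputs_a;       /* deferred iput()s completed for superblock A */
	int iputs_b;       /* deferred iput()s completed for superblock B */
	size_t pending_b;  /* items still queued for superblock B */
};

/* Model "unmount superblock A" with one queue per superblock. */
static struct toy_result toy_unmount_a(void)
{
	struct toy_wq wq_a = {0}, wq_b = {0};
	struct toy_result r = {0};

	toy_queue_work(&wq_a, toy_delayed_iput, &r.iputs_a);
	toy_queue_work(&wq_b, toy_delayed_iput, &r.iputs_b);
	toy_queue_work(&wq_a, toy_delayed_iput, &r.iputs_a);

	/* put_super for A waits only for A's own deferred iput()s. */
	toy_flush_workqueue(&wq_a);

	r.pending_b = wq_b.pending;
	return r;
}
```

In this model, toy_unmount_a() completes both of superblock A's items while
superblock B's item stays queued and is never waited for. With a single
shared queue, the same flush would also have to wait for B's item and for
anything queued by unrelated subsystems.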

On 2022/03/31 1:56, Perepechko, Andrew wrote:
> Hello Tetsuo!
> 
> Thank you for your report.
> 
> I wonder if I can fix this issue by creating a separate per-superblock workqueue.
> 
> I may not fully understand the lockdep magic in process_one_work() so any advice is appreciated.
> 
> As I see it, if there's no shared locking between different workqueues, unmount should be able to flush only its own scheduled tasks (which should not conflict with any p9 tasks) and unblock the locking chain under similar conditions.
> 
> Thank you,
> Andrew
> ________________________________
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Sent: 30 March 2022 05:49
> To: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>
> Cc: Perepechko, Andrew <andrew.perepechko@xxxxxxx>; Andreas Dilger <adilger@xxxxxxxxx>; Theodore Ts'o <tytso@xxxxxxx>; syzbot <syzbot+bde0f89deacca7c765b8@xxxxxxxxxxxxxxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx <linux-kernel@xxxxxxxxxxxxxxx>; syzkaller-bugs@xxxxxxxxxxxxxxxx <syzkaller-bugs@xxxxxxxxxxxxxxxx>; v9fs-developer@xxxxxxxxxxxxxxxxxxxxx <v9fs-developer@xxxxxxxxxxxxxxxxxxxxx>; open list:EXT4 FILE SYSTEM <linux-ext4@xxxxxxxxxxxxxxx>
> Subject: Re: [syzbot] possible deadlock in p9_write_work
> 
> On 2022/03/30 11:29, Dominique Martinet wrote:
>> Tetsuo Handa wrote on Wed, Mar 30, 2022 at 10:57:15AM +0900:
>>>>> Please don't use schedule_work() if you need to use flush_scheduled_work().
>>>>
>>>> In this case we don't call flush_scheduled_work -- ext4 does.
>>>
>>> Yes, that's why I changed recipients to ext4 people.
>>
>> Sorry, I hadn't noticed.
>> 9p is the one calling schedule_work, so ultimately it really is the
>> combination of the two, and not just ext4 that's wrong here.
> 
> Calling schedule_work() itself does not cause trouble (unless there are
> too many pending works to make progress). Calling flush_scheduled_work()
> causes trouble because it waits for completion of all works, even if
> some of the works cannot complete due to locks held by the caller of
> flush_scheduled_work(). 9p is not at fault in this report.