Re: kernel BUG in ext4_writepages

On 5/20/22 02:50, Jan Kara wrote:
On Thu 19-05-22 16:14:17, Tadeusz Struk wrote:
On 5/19/22 05:23, Jan Kara wrote:
Hi!

On Tue 10-05-22 15:28:38, Tadeusz Struk wrote:
Syzbot found another BUG in ext4_writepages [1].
This time it complains about inode with inline data.
A C reproducer can be found here [2].
I was able to trigger it on 5.18.0-rc6.

[1] https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=129da6caf00000

Thanks for the report. This should be fixed by:

https://lore.kernel.org/all/20220516012752.17241-1-yebin10@xxxxxxxxxx/


In the case of the syzbot bug, something is messed up with the page dirty
flags and the way syzbot sets up the write. This is what triggers the crash:

Can you tell me where exactly we hit the bug? I've now noticed that this is
on a 5.10 kernel, and on vanilla 5.10 there's no BUG_ON on line 2753.

We are hitting this bug:
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L2707
Syzbot found it in v5.10, but I recreated it on 5.18-rc7, which is why
the line numbers don't match. It is the same bug, though.
On 5.10 it is at line 2739:
https://elixir.bootlin.com/linux/v5.10.117/source/fs/ext4/inode.c#L2739


$ strace -f ./repro
...
[pid  2395] open("./bus", O_RDWR|O_CREAT|O_SYNC|O_NOATIME, 000 <unfinished ...>
[pid  2395] <... open resumed> )        = 6
...
[pid  2395] write(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 22 <unfinished ...>
...
[pid  2395] <... write resumed> )       = 22
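
For reference, the open()/write() pair from the trace above can be sketched
in C roughly as follows. This is only a minimal sketch, not the syzbot
reproducer: the helper name replay_sync_write() is made up for illustration,
and the flags, mode and 22-byte zero buffer are simply copied from the trace.
On its own it most likely will not trigger the BUG, since the reproducer does
additional setup that is not shown here.

```c
#define _GNU_SOURCE		/* for O_NOATIME */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the syzbot reproducer): replay the
 * open() + write() pair seen in the trace. Returns the number of bytes
 * written, or -1 on error.
 */
static ssize_t replay_sync_write(const char *path)
{
	char buf[22];
	ssize_t n;
	int fd;

	/* The trace shows 22 zero bytes being written. */
	memset(buf, 0, sizeof(buf));

	/* Same flags and mode (000) as in the trace. */
	fd = open(path, O_RDWR | O_CREAT | O_SYNC | O_NOATIME, 0);
	if (fd < 0)
		return -1;

	n = write(fd, buf, sizeof(buf));
	close(fd);
	return n;
}
```

Calling replay_sync_write("./bus") on a scratch ext4 filesystem mirrors the
two syscalls above. Note that the file is created with mode 000, so a second
open of the same path by a non-root user will fail.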

One way I could fix it was to clear the PAGECACHE_TAG_DIRTY on the mapping in
ext4_try_to_write_inline_data() after the page has been updated:

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 9c076262770d..e4bbb53fa26f 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -715,6 +715,7 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
  			put_page(page);
  			goto out_up_read;
  		}
+		__xa_clear_mark(&mapping->i_pages, 0, PAGECACHE_TAG_DIRTY);
  	}
  	ret = 1;

Please let me know if it makes sense and I will send a proper patch.

No, this looks really wrong... We need to better understand what's going
on.

That's what I was afraid of. I'm trying to divert ext4_writepages() to the
out_writepages path before we hit this BUG_ON().
Any hints would be much appreciated.

--
Thanks,
Tadeusz


