On 22/02/09 03:26PM, Ritesh Harjani wrote: > On 22/02/08 05:14PM, Jan Kara wrote: > > On Tue 08-02-22 04:49:19, syzbot wrote: > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit: 555f3d7be91a Merge tag '5.17-rc3-ksmbd-server-fixes' of gi.. > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17e55852700000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=88e0a6a3dbf057cf > > > dashboard link: https://syzkaller.appspot.com/bug?extid=afa2ca5171d93e44b348 > > > compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2 > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13b03872700000 > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > Reported-by: syzbot+afa2ca5171d93e44b348@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > So the syzbot reproducer looks bogus to me but the bug is real. > > jbd2_journal_wait_updates() looks at commit_transaction after dropping > > j_state_lock and sleeping which is certainly prone to use-after-free > > issues. > > Yes, thanks for taking a look at it. > > > > > Funnily, Ritesh's removal of t_handle_lock should "fix" the problem by > > removing this dereference. So Ritesh, please just add some reference to > > syzbot report and maybe a backport to stable would be warranted. > > > > This actually looks like a regression because of commit [1]. > So I am thinking of sending a separate patch [2] as a fix for this (after my > testing). > > Not sure why fstests testing didn't pick this up (given it's a common race), > or is it because of my recent removal of CONFIG_KASAN from my testing :( > > I will try a full "auto" test with CONFIG_KASAN enabled and see if we could hit > this in fstests or not. If not then I will work towards adding a targetted test > to capture such race. FWIW, I did try "auto" run with CONFIG_KASAN enabled (which took hell lot of a time to complete :-|) w/o the fix. But it didn't hit this race. Then, I also tried below combination of dirtying some data and calling checkpoint_journal in seperate thread, but I wasn't able to hit this race. (Using checkpoint_journal since it will call for jbd2_journal_lock_updates()) sudo mount /dev/loop2 /mnt Terminal-1 # Tried other combination of higher D, n & S0/1/2, s0/100/4096 =========== qemu-> echo /mnt/{1..32} |sed -E 's/[[:space:]]+/ -d /g' | xargs -I {} bash -c "sudo fs_mark -L 100000 -D 4 -n 4 -s 100 -S0 -d {}" Terminal-2 ============= qemu-> while [ 1 ]; do sudo ./src/checkpoint_journal /mnt; sleep 5; done Also, I did made other failed attempts with some combinations of running fs_mark with freeze/unfreeze (similar to checkpoint_journal), and also changing "commit=1" mount option to see if we could hit the race with some such combo. So if anyone else has any suggestions on what else can we try to hit this race, so that we can add such test case to fstests, I will be happy to attempt it. -ritesh > > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git/commit/?h=origin&id=4f98186848707f530669238d90e0562d92a78aab > [2]: https://github.com/riteshharjani/linux/commit/628648810011a22dfaba38ead49716720b27a31c > > -ritesh