On Wed, Jan 24, 2024 at 11:01:20AM +0100, Jan Kara wrote:
> On Wed 24-01-24 00:50:10, syzbot wrote:
> > syzbot suspects this issue was fixed by commit:
> >
> > commit 6f861765464f43a71462d52026fbddfc858239a5
> > Author: Jan Kara <jack@xxxxxxx>
> > Date:   Wed Nov 1 17:43:10 2023 +0000
> >
> >     fs: Block writes to mounted block devices
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=119af36be80000
> > start commit:   17214b70a159 Merge tag 'fsverity-for-linus' of git://git.k..
> > git tree:       upstream
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=d40f6d44826f6cf7
> > dashboard link: https://syzkaller.appspot.com/bug?extid=87466712bb342796810a
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1492946ac80000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12e45ad6c80000
>
> So this surprises me a bit because XFS isn't using block device buffer
> cache and thus syzbot has no way of corrupting cached metadata even before
> these changes. The reproducer tries to mount the loop device again after
> mounting the XFS image so I can imagine something bad happens but it isn't
> all that clear what. So I'll defer to XFS maintainers whether they want to
> mark this bug as fixed or investigate further.

I've consistently ignored this bug because it is doing stuff with
corrupted V4 XFS filesystems.

The key part of the strace is as follows. /dev/loop0 has been set up to
point to a memfd that maps the reproducer's internal memory. Then we
see:

....
[ 50.859261][ T5070] XFS (loop0): Deprecated V4 format (crc=0) will not be supported after September 2030.
[ 50.869976][ T5070] XFS (loop0): Mounting V4 Filesystem 5e6273b8-2167-42bb-911b-418aa14a1261
[pid 5070] mount("/dev/loop0", "./file0", "xfs", 0, "filestreams,swidth=0x0000000000000000,nodiscard,logbufs=00000000000000000006,attr2,,nouuid") = 0
[pid 5070] openat(AT_FDCWD, "./file0", O_RDONLY|O_DIRECTORY) = 3
[pid 5070] chdir("./file0") = 0
[pid 5070] ioctl(4, LOOP_CLR_FD) = 0
[pid 5070] close(4) = 0
[pid 5070] open("./bus", O_RDWR|O_CREAT|O_TRUNC|O_NONBLOCK|O_SYNC|O_DIRECT|O_LARGEFILE|O_NOATIME, 000) = 4
[pid 5070] mount("/dev/loop0", "./bus", NULL, MS_BIND, NULL) = 0
[pid 5070] open("./bus", O_RDWR|O_NOCTTY|O_SYNC|O_NOATIME|0x3c) = 5
[pid 5070] openat(AT_FDCWD, "memory.current", O_RDWR|O_CREAT|O_NOCTTY|O_TRUNC|O_APPEND|FASYNC|0x18, 000) = 6

And then there's a corruption report and then the KASAN error.

AFAICT, what the reproducer is doing is setting internal memory as
the backing device for /dev/loop0, then mounting it, then creating a
file in that XFS filesystem, then doing a bind mount of /dev/loop0 to
that file, then opening that file again (which now points to
/dev/loop0) and overwriting it.

As XFS writes back the data to the file, it's actually overwriting
the loop device backing file, i.e. scribbling over the internal memory
of the syzkaller program. The filesystem then goes to read metadata
from the filesystem, and gets back metadata containing:

[ 52.672334][ T4733] 00000000: 66 69 6c 65 73 74 72 65 61 6d 73 2c 73 77 69 64  filestreams,swid
[ 52.681652][ T4733] 00000010: 74 68 3d 30 78 30 30 30 30 30 30 30 30 30 30 30  th=0x00000000000
[ 52.690810][ T4733] 00000020: 30 30 30 30 30 2c 6e 6f 64 69 73 63 61 72 64 2c  00000,nodiscard,
[ 52.700296][ T4733] 00000030: 6c 6f 67 62 75 66 73 3d 30 30 30 30 30 30 30 30  logbufs=00000000
[ 52.709453][ T4733] 00000040: 30 30 30 30 30 30 30 30 30 30 30 36 2c 61 74 74  000000000006,att
[ 52.718572][ T4733] 00000050: 72 32 2c 00 47 ba 76 39 f2 50 ff 99 2f fb b8 b1  r2,.G.v9.P../...
[ 52.728140][ T4733] 00000060: 4c 3a 9b c2 e1 81 d0 9c 24 97 6b 33 7f 55 f4 90  L:......$.k3.U..
[ 52.737174][ T4733] 00000070: 15 4c b3 65 d3 52 86 f0 51 c3 11 75 df a1 cc f1  .L.e.R..Q..u....

That is, the mount options used to mount the filesystem in the first
place.

I'd suggest that the commit the bisect landed on is blocking the
second open of "./bus" after it was bind mounted to /dev/loop0, so
the write that corrupts the filesystem image never occurs. Hence the
XFS filesystem never trips over that error and never triggers the
KASAN warning.

So, yeah, I can see why syzbot might think that commit fixed the
problem. But it didn't - it just broke the reproducer so the
corruption that triggered the problem never manifested...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
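
The sequence described above can be illustrated with a small user-space
toy model. This is only a sketch under stated assumptions, not kernel or
XFS code: `LoopDevice`, `ToyFilesystem`, the `META` magic value, the
metadata location at offset 0 and the `block_writes` switch are all
hypothetical names invented for the illustration. It shows a write
through an alias of the mounted device landing in the backing memory,
and how refusing that write hides the corruption without changing how
the filesystem handles corrupt metadata.

```python
class LoopDevice:
    """Toy block device whose contents live in a caller-owned buffer
    (stands in for /dev/loop0 backed by the reproducer's memfd)."""

    def __init__(self, backing: bytearray, block_writes: bool = False):
        self.backing = backing
        self.mounted = False
        # Hypothetical switch modelling "block writes to mounted block devices".
        self.block_writes = block_writes

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.backing[offset:offset + length])

    def open_for_write(self) -> "LoopDevice":
        # The exception type is arbitrary; the real errno is not modelled here.
        if self.mounted and self.block_writes:
            raise PermissionError("device is mounted; write open refused")
        return self

    def write(self, offset: int, data: bytes) -> None:
        self.backing[offset:offset + len(data)] = data


class ToyFilesystem:
    MAGIC = b"META"  # hypothetical magic, not an XFS on-disk value

    def __init__(self, dev: LoopDevice):
        self.dev = dev
        dev.mounted = True

    def read_metadata(self) -> bytes:
        # Assumption: the metadata block checked here sits at offset 0.
        block = self.dev.read(0, 16)
        if not block.startswith(self.MAGIC):
            raise ValueError(f"metadata corruption detected: {block!r}")
        return block


def run_reproducer_model(block_writes: bool) -> str:
    image = bytearray(b"META" + bytes(60))     # reproducer's internal memory
    loop0 = LoopDevice(image, block_writes)
    fs = ToyFilesystem(loop0)                  # mount("/dev/loop0", "./file0", ...)
    fs.read_metadata()                         # metadata is fine at mount time

    try:
        # After the bind mount, "./bus" is just another name for the device.
        bus = loop0.open_for_write()
    except PermissionError:
        return "write open blocked; corruption never happens"

    # Data from the dump in the log: the start of the mount option string.
    bus.write(0, b"filestreams,swid")
    try:
        fs.read_metadata()
    except ValueError as err:
        return str(err)
    return "no corruption"


if __name__ == "__main__":
    print(run_reproducer_model(block_writes=False))
    print(run_reproducer_model(block_writes=True))
```

In this model the first call reports corrupted metadata and the second
never reaches the write, which mirrors the argument that the bisected
commit stopped the reproducer from working rather than fixing whatever
the filesystem does once it reads the corrupted metadata.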