On Tue, Nov 19, 2024 at 08:59:19AM -0500, Jeff Layton wrote:
> On Sat, 2024-11-16 at 10:07 -0500, Chuck Lever wrote:
> > On Fri, Nov 15, 2024 at 05:31:38PM -0800, Hugh Dickins wrote:
> > > On Fri, 15 Nov 2024, Chuck Lever wrote:
> > > > 
> > > > As I said before, I've failed to find any file system getattr method
> > > > that explicitly takes the inode's semaphore around a
> > > > generic_fillattr() call. My understanding is that these methods
> > > > assume that their caller handles appropriate serialization.
> > > > Therefore, taking the inode semaphore at all in shmem_getattr()
> > > > seems to me to be the wrong approach entirely.
> > > > 
> > > > The point of reverting immediately is that any fix can't possibly
> > > > get the review and testing it deserves three days before v6.12
> > > > becomes final. Also, it's not clear what the rush to fix the
> > > > KCSAN splat is; according to the Fixes: tag, this issue has been
> > > > present for years without causing overt problems. It doesn't feel
> > > > like this change belongs in an -rc in the first place.
> > > > 
> > > > Please revert d949d1d14fa2, then let's discuss a proper fix in a
> > > > separate thread. Thanks!
> > > 
> > > Thanks so much for reporting this issue, Chuck: just in time.
> > > 
> > > I agree abso-lutely with you: I was just preparing a revert,
> > > when I saw that akpm is already on it: great, thanks Andrew.
> > 
> > Thanks to you both for jumping on this!
> > 
> > > I was not very keen to see that locking added, just to silence a syzbot
> > > sanitizer splat: added where there has never been any practical problem
> > > (and the result of any stat immediately stale anyway). I was hoping we
> > > might get a performance regression reported, but a hang serves better!
> > > 
> > > If there's a "data_race"-like annotation that can be added to silence
> > > the sanitizer, okay. But more locking? I don't think so.
> > 
> > IMHO the KCSAN splat suggests there is a missing inode_lock/unlock
> > pair /somewhere/. Just not in shmem_getattr().
> > 
> > Earlier in this thread, Jeongjun reported:
> > > ... I found that this data-race mainly occurs when vfs_statx_path
> > > does not acquire the inode_lock ...
> > 
> > That sounds to me like a better place to address this; or at least
> > that is a good place to start looking for the problem.
> 
> I don't think this data race is anything to worry about:
> 
> ==================================================================
> BUG: KCSAN: data-race in generic_fillattr / inode_set_ctime_current
> 
> write to 0xffff888102eb3260 of 4 bytes by task 6565 on cpu 1:
>  inode_set_ctime_to_ts include/linux/fs.h:1638 [inline]
>  inode_set_ctime_current+0x169/0x1d0 fs/inode.c:2626
>  shmem_mknod+0x117/0x180 mm/shmem.c:3443
>  shmem_create+0x34/0x40 mm/shmem.c:3497
>  lookup_open fs/namei.c:3578 [inline]
>  open_last_lookups fs/namei.c:3647 [inline]
>  path_openat+0xdbc/0x1f00 fs/namei.c:3883
>  do_filp_open+0xf7/0x200 fs/namei.c:3913
>  do_sys_openat2+0xab/0x120 fs/open.c:1416
>  do_sys_open fs/open.c:1431 [inline]
>  __do_sys_openat fs/open.c:1447 [inline]
>  __se_sys_openat fs/open.c:1442 [inline]
>  __x64_sys_openat+0xf3/0x120 fs/open.c:1442
>  x64_sys_call+0x1025/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:258
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> read to 0xffff888102eb3260 of 4 bytes by task 3498 on cpu 0:
>  inode_get_ctime_nsec include/linux/fs.h:1623 [inline]
>  inode_get_ctime include/linux/fs.h:1629 [inline]
>  generic_fillattr+0x1dd/0x2f0 fs/stat.c:62
>  shmem_getattr+0x17b/0x200 mm/shmem.c:1157
>  vfs_getattr_nosec fs/stat.c:166 [inline]
>  vfs_getattr+0x19b/0x1e0 fs/stat.c:207
>  vfs_statx_path fs/stat.c:251 [inline]
>  vfs_statx+0x134/0x2f0 fs/stat.c:315
>  vfs_fstatat+0xec/0x110 fs/stat.c:341
>  __do_sys_newfstatat fs/stat.c:505 [inline]
>  __se_sys_newfstatat+0x58/0x260 fs/stat.c:499
>  __x64_sys_newfstatat+0x55/0x70 fs/stat.c:499
>  x64_sys_call+0x141f/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:263
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> value changed: 0x2755ae53 -> 0x27ee44d3
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 UID: 0 PID: 3498 Comm: udevd Not tainted 6.11.0-rc6-syzkaller-00326-gd1f2d51b711a-dirty #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
> ==================================================================
> 
> inode_set_ctime_to_ts() is just setting the ctime fields in the inode
> to the timespec64. inode_get_ctime_nsec() is just fetching the ctime
> nsec field.
> 
> We've never tried to synchronize readers and writers when it comes to
> timestamps. It has always been possible to read an inconsistent
> timestamp from the inode. The sec and nsec fields are in different
> words, and there is no synchronization when updating the fields.
> 
> Timestamp fetches seem like the right place to use a seqcount loop, but
> I don't think we would want to add any sort of locking to the update
> side of things. Maybe that's doable?

I don't feel that the observed data race is a major issue, but it did
seem clear that the issue was not restricted to shmem_getattr().

What does seem like an important issue is that there doesn't appear to
be a way to annotate "data race is expected" areas so that KCSAN does
not generate a splat.
False positives reduce the signal-to-noise ratio, making the tool
pretty useless over time.

-- 
Chuck Lever