On Fri 18-08-23 13:21:17, Matthew Wilcox wrote:
> On Fri, Aug 18, 2023 at 10:01:32AM +0200, Mirsad Todorovac wrote:
> > [ 206.510010] ==================================================================
> > [ 206.510035] BUG: KCSAN: data-race in xas_clear_mark / xas_find_marked
> >
> > [ 206.510067] write to 0xffff963df6a90fe0 of 8 bytes by interrupt on cpu 22:
> > [ 206.510081] xas_clear_mark+0xd5/0x180
> > [ 206.510097] __xa_clear_mark+0xd1/0x100
> > [ 206.510114] __folio_end_writeback+0x293/0x5a0
> > [ 206.520722] read to 0xffff963df6a90fe0 of 8 bytes by task 2793 on cpu 6:
> > [ 206.520735] xas_find_marked+0xe5/0x600
> > [ 206.520750] filemap_get_folios_tag+0xf9/0x3d0
>
> Also, before submitting this kind of report, you should run the
> trace through scripts/decode_stacktrace.sh to give us line numbers
> instead of hex offsets, which are useless to anyone who doesn't have
> your exact kernel build.
>
> > [ 206.510010] ==================================================================
> > [ 206.510035] BUG: KCSAN: data-race in xas_clear_mark / xas_find_marked
> >
> > [ 206.510067] write to 0xffff963df6a90fe0 of 8 bytes by interrupt on cpu 22:
> > [ 206.510081] xas_clear_mark (./arch/x86/include/asm/bitops.h:178 ./include/asm-generic/bitops/instrumented-non-atomic.h:115 lib/xarray.c:102 lib/xarray.c:914)
> > [ 206.510097] __xa_clear_mark (lib/xarray.c:1923)
> > [ 206.510114] __folio_end_writeback (mm/page-writeback.c:2981)
>
> This path is properly using xa_lock_irqsave() before calling
> __xa_clear_mark().
>
> > [ 206.520722] read to 0xffff963df6a90fe0 of 8 bytes by task 2793 on cpu 6:
> > [ 206.520735] xas_find_marked (./include/linux/xarray.h:1706 lib/xarray.c:1354)
> > [ 206.520750] filemap_get_folios_tag (mm/filemap.c:1975 mm/filemap.c:2273)
>
> This takes the RCU read lock before calling xas_find_marked() as it's
> supposed to.
>
> What garbage do I have to write to tell KCSAN it's wrong?  The line
> that's probably triggering it is currently:
>
>         unsigned long data = *addr & (~0UL << offset);

I don't think it is actually wrong in this case. You're accessing xarray
only with RCU protection so it can be changing under your hands. For
example the code in xas_find_chunk():

        unsigned long data = *addr & (~0UL << offset);
        if (data)
                return __ffs(data);

is prone to the compiler refetching 'data' from *addr after checking for
data != 0 and getting 0 the second time which would trigger undefined
behavior of __ffs(). So that code should definitely use READ_ONCE() to make
things safe.

BTW, find_next_bit() seems to need a similar treatment and in fact I'm not
sure why xas_find_chunk() has a special case for XA_CHUNK_SIZE ==
BITS_PER_LONG because find_next_bit() checks for that and handles that in a
fast path in the same way.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
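
For illustration only, here is a minimal userspace sketch of the single-load
pattern suggested above. It is not the actual lib/xarray.c change. The names
READ_ONCE_ULONG and first_marked_from are hypothetical, the macro is a
simplified stand-in for the kernel's READ_ONCE(), and __builtin_ctzl() (a
GCC/Clang builtin) stands in for __ffs(). Like __ffs(), it is undefined for a
zero argument, which is why the tested value and the value passed to it must
come from the same load.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Assumption: simplified stand-in for the kernel's READ_ONCE(). The volatile
 * access makes the compiler emit exactly one load and forbids refetching.
 */
#define READ_ONCE_ULONG(x) (*(const volatile unsigned long *)&(x))

/*
 * Hypothetical helper mirroring the quoted snippet: return the index of the
 * first set bit at or above 'offset', or BITS_PER_LONG if there is none.
 */
static unsigned int first_marked_from(const unsigned long *addr,
				      unsigned int offset)
{
	unsigned long data;

	assert(offset < BITS_PER_LONG);

	/* One load; 'data' is a private snapshot from here on. */
	data = READ_ONCE_ULONG(*addr) & (~0UL << offset);
	if (data)
		return (unsigned int)__builtin_ctzl(data); /* data != 0 here */

	return (unsigned int)BITS_PER_LONG;
}
```

With a plain '*addr' read, the compiler is allowed to reload the word for the
bit scan after the 'if (data)' test has passed, so a concurrent clear could
hand a zero to the scan. The volatile access rules that out for this one word.
It provides no ordering against other memory accesses.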