Re: [BUG] KCSAN: data-race in xas_clear_mark / xas_find_marked

On 9/14/23 10:08, Jan Kara wrote:
On Fri 18-08-23 13:21:17, Matthew Wilcox wrote:
On Fri, Aug 18, 2023 at 10:01:32AM +0200, Mirsad Todorovac wrote:
[  206.510010] ==================================================================
[  206.510035] BUG: KCSAN: data-race in xas_clear_mark / xas_find_marked

[  206.510067] write to 0xffff963df6a90fe0 of 8 bytes by interrupt on cpu 22:
[  206.510081]  xas_clear_mark+0xd5/0x180
[  206.510097]  __xa_clear_mark+0xd1/0x100
[  206.510114]  __folio_end_writeback+0x293/0x5a0
[  206.520722] read to 0xffff963df6a90fe0 of 8 bytes by task 2793 on cpu 6:
[  206.520735]  xas_find_marked+0xe5/0x600
[  206.520750]  filemap_get_folios_tag+0xf9/0x3d0
Also, before submitting this kind of report, you should run the
trace through scripts/decode_stacktrace.sh to give us line numbers
instead of hex offsets, which are useless to anyone who doesn't have
your exact kernel build.

[  206.510010] ==================================================================
[  206.510035] BUG: KCSAN: data-race in xas_clear_mark / xas_find_marked

[  206.510067] write to 0xffff963df6a90fe0 of 8 bytes by interrupt on cpu 22:
[  206.510081] xas_clear_mark (./arch/x86/include/asm/bitops.h:178 ./include/asm-generic/bitops/instrumented-non-atomic.h:115 lib/xarray.c:102 lib/xarray.c:914)
[  206.510097] __xa_clear_mark (lib/xarray.c:1923)
[  206.510114] __folio_end_writeback (mm/page-writeback.c:2981)

This path is properly using xa_lock_irqsave() before calling
__xa_clear_mark().

[  206.520722] read to 0xffff963df6a90fe0 of 8 bytes by task 2793 on cpu 6:
[  206.520735] xas_find_marked (./include/linux/xarray.h:1706 lib/xarray.c:1354)
[  206.520750] filemap_get_folios_tag (mm/filemap.c:1975 mm/filemap.c:2273)

This takes the RCU read lock before calling xas_find_marked() as it's
supposed to.

What garbage do I have to write to tell KCSAN it's wrong?  The line
that's probably triggering it is currently:

                         unsigned long data = *addr & (~0UL << offset);

I don't think it is actually wrong in this case. You're accessing xarray
only with RCU protection so it can be changing under your hands. For
example the code in xas_find_chunk():

                         unsigned long data = *addr & (~0UL << offset);
                         if (data)
                                 return __ffs(data);

is prone to the compiler refetching 'data' from *addr after the data != 0
check and getting 0 the second time, which would trigger undefined behavior
in __ffs(). So that code should definitely use READ_ONCE() to make things
safe.
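
To make the refetch concern concrete, here is a minimal userspace sketch of the same pattern. The names READ_ONCE_UL and first_marked_from are hypothetical stand-ins for illustration, not kernel API: the macro only approximates READ_ONCE() with a volatile access, and __builtin_ctzl (a GCC/Clang builtin) stands in for __ffs(). Like __ffs(), it is undefined for a zero argument, which is why the value that is tested and the value that is scanned must come from one load.

```c
#include <limits.h>

/* Hypothetical userspace approximation of READ_ONCE(): the volatile
 * access makes the compiler perform exactly one load and forbids it
 * from re-reading the word later. */
#define READ_ONCE_UL(x) (*(const volatile unsigned long *)&(x))

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the index of the first set bit at or above 'offset',
 * or BITS_PER_LONG if no such bit is set. */
static unsigned int first_marked_from(const unsigned long *addr,
                                      unsigned int offset)
{
    if (offset < BITS_PER_LONG) {
        /* Single snapshot: the zero check and the bit scan below both
         * operate on this local copy, even if *addr changes meanwhile. */
        unsigned long data = READ_ONCE_UL(*addr) & (~0UL << offset);

        if (data)
            return (unsigned int)__builtin_ctzl(data);
    }
    return (unsigned int)BITS_PER_LONG;
}
```

With a word that has bits 3 and 5 set, first_marked_from(&w, 0) yields 3 and first_marked_from(&w, 4) yields 5. A concurrent writer can still make the answer stale, but the scan can no longer be handed a zero.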

BTW, find_next_bit() seems to need a similar treatment and in fact I'm not
sure why xas_find_chunk() has a special case for XA_CHUNK_SIZE ==
BITS_PER_LONG because find_next_bit() checks for that and handles that in a
fast path in the same way.
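
For illustration only, a rough userspace sketch of a bit search with a single-word fast path and a multi-word general path. This is not the kernel's find_next_bit() implementation; sketch_find_next_bit and READ_ONCE_UL are hypothetical names, and __builtin_ctzl (GCC/Clang) stands in for __ffs(). It assumes that every word is loaded once into a local before being tested and scanned.

```c
#include <limits.h>

/* Hypothetical userspace approximation of READ_ONCE(). */
#define READ_ONCE_UL(x) (*(const volatile unsigned long *)&(x))

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the index of the first set bit in [offset, size),
 * or 'size' if there is none. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    if (offset >= size)
        return size;

    if (size <= BITS_PER_LONG) {
        /* Fast path: the whole bitmap fits in one word. */
        unsigned long mask = ~0UL << offset;
        unsigned long val;

        if (size < BITS_PER_LONG)
            mask &= (1UL << size) - 1;  /* drop bits at or above 'size' */
        val = READ_ONCE_UL(addr[0]) & mask;
        return val ? (unsigned long)__builtin_ctzl(val) : size;
    }

    /* General path: walk word by word, one load per word. */
    unsigned long idx = offset / BITS_PER_LONG;
    unsigned long word = READ_ONCE_UL(addr[idx]) &
                         (~0UL << (offset % BITS_PER_LONG));

    for (;;) {
        if (word) {
            unsigned long bit = idx * BITS_PER_LONG +
                                (unsigned long)__builtin_ctzl(word);
            return bit < size ? bit : size;
        }
        idx++;
        if (idx * BITS_PER_LONG >= size)
            return size;
        word = READ_ONCE_UL(addr[idx]);
    }
}
```

When size equals BITS_PER_LONG, the fast path reduces to the same mask-and-scan done by the XA_CHUNK_SIZE == BITS_PER_LONG special case, which is the overlap noted above.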

								Honza

Hi,

Thank you for your insight on the matter.

I guess you meant something like implementing this:

 include/linux/xarray.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index cb571dfcf4b1..1715fd322d62 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1720,7 +1720,7 @@ static inline unsigned int xas_find_chunk(struct xa_state *xas, bool advance,
                offset++;
        if (XA_CHUNK_SIZE == BITS_PER_LONG) {
                if (offset < XA_CHUNK_SIZE) {
-                       unsigned long data = *addr & (~0UL << offset);
+                       unsigned long data = READ_ONCE(*addr) & (~0UL << offset);
                        if (data)
                                return __ffs(data);
                }


With this change applied, KCSAN no longer reports the xas_find_marked()
warning, so this might have been a real data race after all.

Do you think we should escalate this to a formal patch?

Best regards,
Mirsad Todorovac


