Re: KCSAN: data-race in __xa_clear_mark / xas_find_marked

[+Cc XArray maintainer]

Hi Matthew,

On Mon, Aug 10, 2020 at 05:41AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    fc80c51f Merge tag 'kbuild-v5.9' of git://git.kernel.org/p..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13cb73fa900000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=997a92ee4b5588ef
> dashboard link: https://syzkaller.appspot.com/bug?extid=0d4522639ba75b02bf19
> compiler:       clang version 11.0.0 (https://github.com/llvm/llvm-project.git ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+0d4522639ba75b02bf19@xxxxxxxxxxxxxxxxxxxxxxxxx
> 
> ==================================================================
> BUG: KCSAN: data-race in __xa_clear_mark / xas_find_marked
> 
> write to 0xffff8880bace9b30 of 8 bytes by interrupt on cpu 1:
>  instrument_write include/linux/instrumented.h:42 [inline]
>  __test_and_clear_bit include/asm-generic/bitops/instrumented-non-atomic.h:85 [inline]
>  node_clear_mark lib/xarray.c:100 [inline]
>  xas_clear_mark lib/xarray.c:908 [inline]
>  __xa_clear_mark+0x229/0x350 lib/xarray.c:1726
>  test_clear_page_writeback+0x28d/0x480 mm/page-writeback.c:2739
>  end_page_writeback+0xa7/0x110 mm/filemap.c:1369
>  page_endio+0x1aa/0x1e0 mm/filemap.c:1400
>  mpage_end_io+0x186/0x1d0 fs/mpage.c:54
>  bio_endio+0x28a/0x370 block/bio.c:1447
>  req_bio_endio block/blk-core.c:259 [inline]
>  blk_update_request+0x535/0xbd0 block/blk-core.c:1576
>  blk_mq_end_request+0x22/0x50 block/blk-mq.c:562
>  lo_complete_rq+0xca/0x180 drivers/block/loop.c:500
>  blk_done_softirq+0x1a5/0x200 block/blk-mq.c:586
>  __do_softirq+0x198/0x360 kernel/softirq.c:298
>  run_ksoftirqd+0x2f/0x60 kernel/softirq.c:652
>  smpboot_thread_fn+0x347/0x530 kernel/smpboot.c:165
>  kthread+0x20d/0x230 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> 
> read to 0xffff8880bace9b30 of 8 bytes by task 12715 on cpu 0:
>  xas_find_chunk include/linux/xarray.h:1625 [inline]
>  xas_find_marked+0x22f/0x6b0 lib/xarray.c:1198
>  find_get_pages_range_tag+0xa3/0x580 mm/filemap.c:1976
>  pagevec_lookup_range_tag+0x37/0x50 mm/swap.c:1120
>  __filemap_fdatawait_range+0xab/0x1b0 mm/filemap.c:519
>  filemap_fdatawait_range mm/filemap.c:554 [inline]
>  filemap_write_and_wait_range+0x119/0x2a0 mm/filemap.c:664
>  generic_file_read_iter+0x11d/0x3e0 mm/filemap.c:2375
>  call_read_iter include/linux/fs.h:1866 [inline]
>  generic_file_splice_read+0x22b/0x310 fs/splice.c:312
>  do_splice_to fs/splice.c:870 [inline]
>  splice_direct_to_actor+0x2a8/0x660 fs/splice.c:950
>  do_splice_direct+0xf2/0x170 fs/splice.c:1059
>  do_sendfile+0x56a/0xba0 fs/read_write.c:1540
>  __do_sys_sendfile64 fs/read_write.c:1595 [inline]
>  __se_sys_sendfile64 fs/read_write.c:1587 [inline]
>  __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1587
>  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 PID: 12715 Comm: syz-executor.4 Not tainted 5.8.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> ==================================================================

We had a discussion around this earlier this year:

	https://lore.kernel.org/lkml/20200305151831.GM29971@xxxxxxxxxxxxxxxxxxxxxx/#t

where you mentioned:

> - If a bit was set before and after the modification, it must be seen to
>   be set.
> - If a bit was clear before and after the modification, it must be seen to
>   be clear.
> - If a bit is modified, it may be seen as set or clear.

Do the atomic bitops satisfy those criteria?
(There were still some open issues around find_next_bit(), but maybe
we can fix those?)
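
To make the question concrete, here is a minimal userspace sketch of what
"satisfying those criteria" could look like. It is not the XArray code and
not a proposed patch: it uses C11 <stdatomic.h> rather than the kernel
bitops, and the names marks, mark_set(), mark_test_and_clear(),
marks_snapshot() and snapshot_next_marked() are all hypothetical. The
assumption is that the writer changes a bit with an atomic read-modify-write
on the whole word and the reader takes a single atomic load of that word
before scanning it, so bits that are not being modified always show their
stable value and only the modified bit may be seen either way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one word of a node's mark bitmap. */
static _Atomic unsigned long marks;

/* Writer: atomically set one bit; other bits in the word are untouched.
 * The caller must pass a bit index smaller than the width of the word. */
static void mark_set(unsigned int bit)
{
	atomic_fetch_or_explicit(&marks, 1UL << bit, memory_order_relaxed);
}

/* Writer: atomically clear one bit and report whether it was set before,
 * roughly what an atomic test-and-clear would do. */
static bool mark_test_and_clear(unsigned int bit)
{
	unsigned long mask = 1UL << bit;
	unsigned long old;

	old = atomic_fetch_and_explicit(&marks, ~mask, memory_order_relaxed);
	return (old & mask) != 0;
}

/* Reader: one atomic load of the whole word. Every bit in the returned
 * value comes from the same point in the word's modification order. */
static unsigned long marks_snapshot(void)
{
	return atomic_load_explicit(&marks, memory_order_relaxed);
}

/* Reader: scan the private snapshot, not the shared word, for the first
 * set bit at or after 'start'. Returns -1 if there is none. */
static int snapshot_next_marked(unsigned long snap, unsigned int start)
{
	for (unsigned int bit = start; bit < 8 * sizeof(snap); bit++)
		if (snap & (1UL << bit))
			return (int)bit;
	return -1;
}
```

The point of scanning a snapshot is that the search never re-reads the
shared word halfway through. Whether the kernel's find_next_bit() can be
given the same property is exactly the open issue mentioned above.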

In general, we're wondering what is required to address this properly.

[ Note: There are a bunch more syzbot reports, which can be treated as
  duplicates, and haven't been sent to LKML:
	https://syzkaller.appspot.com/bug?id=b3f09ccd19880d00592d1692ae3bfe5933fa2b86
	https://syzkaller.appspot.com/bug?id=783c9bf4ad668f022c60e9b12bd8ce9974c1512a
	https://syzkaller.appspot.com/bug?id=711fd5ad665157363e7a21df0c3808884ebeabb9
	https://syzkaller.appspot.com/bug?id=cd60a83c9ff17c293fbb51355cf7b2f0420c4e0e
	https://syzkaller.appspot.com/bug?id=4b16c74b38549b01920b73e5f2df53be5e8dae75
	https://syzkaller.appspot.com/bug?id=7df642f4aa1c195834b4687ed3a9f18cd7f12ae8 ]

Thanks,
-- Marco



