[BUG] filemap_get_read_batch()

Hi,

I've hit the filemap_get_read_batch() BUG [1] that I think I saw Dave
had also recently reported. It looks like the problem is essentially a
race between reads, pagecache removal and folio reinsertion that leads
to an invalid folio pointer. E.g., what I observe is the following
(ordered) sequence of events:

Task A:
- Lands in filemap_get_read_batch() looking for a couple of folio indexes,
  currently both populated by single-page folios.
- Grabs the folio at the first index and starts to process it.

Task B:
- Invalidates several folios from the mapping, including both the
  aforementioned folios task A is after.

Task C:
- Instantiates a compound (order 2) folio that covers both indexes being
  processed by task A.

Task A:
- Iterates to the next xarray index based on the (now already removed)
  non-compound folio via xas_advance()/xas_next().
- BUG splat down in folio_try_get_rcu() on the folio pointer.

I'm not quite sure what is being returned from the xarray here. It
doesn't appear to be another page or anything (i.e. a tail page of a
different folio sort of like we saw with the iomap writeback completion
issue). I just get more splats if I try to access it purely as a page,
so I'm not sure it's a pointer at all. I don't have enough context on
the xarray bits to intuit whether it might be internal data or just
garbage if the node happened to be reformatted, etc. If you have any
thoughts on extra things to check, I can try to dig further into it.

In any event, it sort of feels like this folio order change somehow
perturbs the xarray iteration, since IIUC the non-compound page variant
has been in place for a while, but that could just be wrong or
circumstantial. I'm not sure if it's possible to check the xarray node
for such changes before attempting to process the returned entry (while
preserving the lockless algorithm). FWIW, wrapping the whole lookup in
an xa_lock_irq(&mapping->i_pages) lock cycle does make the problem
disappear.

Brian

[1]

BUG: kernel NULL pointer dereference, address: 0000000000000106
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 72 PID: 297881 Comm: xfs_io Tainted: G          I       5.19.0-rc2+ #160
Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
RIP: 0010:filemap_get_read_batch+0x8e/0x240
Code: 81 ff 06 04 00 00 0f 84 f7 00 00 00 48 81 ff 02 04 00 00 0f 84 c4 00 00 00 48 39 6c 24 08 0f 87 84 00 00 00 40 f6 c7 01 75 7e <8b> 47 34 85 c0 0f 84 a8 00 00 00 8d 50 01 48 8d 77 34 f0 0f b1 57
RSP: 0018:ffffacdf200d7c28 EFLAGS: 00010246
RAX: 0000000000000039 RBX: ffffacdf200d7d68 RCX: 0000000000000034
RDX: ffff9b2aa6805220 RSI: 0000000000000074 RDI: 00000000000000d2
RBP: 0000000000000075 R08: 0000000000000402 R09: ffff9b2a89ac4488
R10: 0000000000020000 R11: 0000000000000000 R12: ffff9b2a89ac4600
R13: 0000000000000075 R14: 0000000000000074 R15: ffffacdf200d7e88
FS:  00007fbba6ce7800(0000) GS:ffff9b29c1100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000106 CR3: 0000000150c7e001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 filemap_get_pages+0x80/0x710
 ? current_time+0x1b/0xd0
 ? atime_needs_update+0xfc/0x170
 ? touch_atime+0x27/0x190
 filemap_read+0xa8/0x310
 ? __folio_start_writeback+0x91/0x2d0
 ? folio_add_lru+0x8d/0x100
 ? _raw_spin_unlock+0x15/0x30
 ? __handle_mm_fault+0xd13/0xf50
 xfs_file_buffered_read+0x50/0xd0 [xfs]
 xfs_file_read_iter+0x70/0xd0 [xfs]
 new_sync_read+0xf6/0x160
 vfs_read+0x138/0x190
 __x64_sys_pread64+0x6e/0xa0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fbba71fa1ef
Code: 08 89 3c 24 48 89 4c 24 18 e8 2d f4 ff ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 7d f4 ff ff 48 8b
RSP: 002b:00007ffcdc03d0d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fbba71fa1ef
RDX: 0000000000001000 RSI: 000056021a601000 RDI: 0000000000000003
RBP: 0000000000074000 R08: 0000000000000000 R09: 00007fbba7140a60
R10: 0000000000074000 R11: 0000000000000293 R12: 0000000000074000
R13: 000000000002c000 R14: 00000000000a0000 R15: 0000000000001000



