Re: Kernel RIP 0010:cifs_flush_folio

On Thu, Apr 25, 2024 at 12:05 PM Ritvik Budhiraja
<budhirajaritviksmb@xxxxxxxxx> wrote:
>
> The test that failed was generic/074, with: Output mismatch;
> Write failed at offset 9933824, Write failed at offset 9961472,
> Write failed at offset 9950208. The kernel version for the machine
> was Ubuntu 22.04, 6.5.0-1018-azure
>
> The target server was Azure Files XNFS

Correction. The server for the test that generated this stack would be
Azure Files SMB. Not NFS.

>
> On Thu, 25 Apr 2024 at 11:53, Steve French <smfrench@xxxxxxxxx> wrote:
>>
>> It is plausible that this is the same bug as in the report.  What
>> kernel version is the xfstest failure on (and which xfstest)?
>>
>> Presumably this does not fail with recent kernels (e.g. 6.7 or later) correct?
>>
>> Since this is clone range (which not all servers support), what is the
>> target server (ksmbd? Samba on btrfs? Windows on REFS?)
>>
>> On Thu, Apr 25, 2024 at 1:14 AM Ritvik Budhiraja
>> <budhirajaritviksmb@xxxxxxxxx> wrote:
>> >
>> > Hi Steve,
>> > While investigating xfstest results I came across the below kernel oops. I have seen this in some of the xfstest failures. Is this a known issue?
>> >
>> > I have identified a similar Ubuntu bug:  Bug #2060919 “cifs: Copying file to same directory results in pa...” : Bugs : linux package : Ubuntu (launchpad.net)
>> >
>> > Reference dmesg logs:
>> > BUG: unable to handle page fault for address: fffffffffffffffe
>> > [Tue Apr 23 09:22:02 2024] #PF: supervisor read access in kernel mode
>> > [Tue Apr 23 09:22:02 2024] #PF: error_code(0x0000) - not-present page
>> > [Tue Apr 23 09:22:02 2024] PGD 19d43b067 P4D 19d43b067 PUD 19d43d067 PMD 0
>> > [Tue Apr 23 09:22:02 2024] Oops: 0000 [#68] SMP NOPTI
>> > [Tue Apr 23 09:22:02 2024] CPU: 1 PID: 3856364 Comm: fsstress Tainted: G      D            6.5.0-1018-azure #19~22.04.2-Ubuntu
>> > [Tue Apr 23 09:22:03 2024] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/28/2023
>> > [Tue Apr 23 09:22:03 2024] RIP: 0010:cifs_flush_folio+0x41/0xe0 [cifs]
>> > [Tue Apr 23 09:22:03 2024] Code: 49 89 cd 31 c9 41 54 53 48 89 f3 48 c1 ee 0c 48 83 ec 10 48 8b 7f 30 44 89 45 d4 e8 29 61 8e c6 49 89 c4 31 c0 4d 85 e4 74 7d <49> 8b 14 24 b8 00 10 00 00 f7 c2 00 00 01 00 74 12 41 0f b6 4c 24
>> > [Tue Apr 23 09:22:03 2024] RSP: 0018:ffffb182c3d3fcc0 EFLAGS: 00010282
>> > [Tue Apr 23 09:22:03 2024] RAX: 0000000000000000 RBX: 0000000011d00000 RCX: 0000000000000000
>> > [Tue Apr 23 09:22:03 2024] RDX: 0000000000000000 RSI: 0000000000011d00 RDI: ffffb182c3d3fc10
>> > [Tue Apr 23 09:22:03 2024] RBP: ffffb182c3d3fcf8 R08: 0000000000000001 R09: 0000000000000000
>> > [Tue Apr 23 09:22:03 2024] R10: 0000000011cfffff R11: 0000000000000000 R12: fffffffffffffffe
>> > [Tue Apr 23 09:22:03 2024] R13: ffffb182c3d3fd48 R14: ffff994311023c30 R15: ffffb182c3d3fd40
>> > [Tue Apr 23 09:22:03 2024] FS:  00007c82b3e10740(0000) GS:ffff9944b7d00000(0000) knlGS:0000000000000000
>> > [Tue Apr 23 09:22:03 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [Tue Apr 23 09:22:03 2024] CR2: fffffffffffffffe CR3: 00000001acb52000 CR4: 0000000000350ee0
>> > [Tue Apr 23 09:22:03 2024] Call Trace:
>> > [Tue Apr 23 09:22:03 2024]  <TASK>
>> > [Tue Apr 23 09:22:03 2024]  ? show_regs+0x6a/0x80
>> > [Tue Apr 23 09:22:03 2024]  ? __die+0x25/0x70
>> > [Tue Apr 23 09:22:03 2024]  ? page_fault_oops+0x79/0x180
>> > [Tue Apr 23 09:22:03 2024]  ? srso_return_thunk+0x5/0x10
>> > [Tue Apr 23 09:22:03 2024]  ? search_exception_tables+0x61/0x70
>> > [Tue Apr 23 09:22:03 2024]  ? srso_return_thunk+0x5/0x10
>> > [Tue Apr 23 09:22:03 2024]  ? kernelmode_fixup_or_oops+0xa2/0x120
>> > [Tue Apr 23 09:22:03 2024]  ? __bad_area_nosemaphore+0x16f/0x280
>> > [Tue Apr 23 09:22:03 2024]  ? terminate_walk+0x97/0xf0
>> > [Tue Apr 23 09:22:03 2024]  ? bad_area_nosemaphore+0x16/0x20
>> > [Tue Apr 23 09:22:03 2024]  ? do_kern_addr_fault+0x62/0x80
>> > [Tue Apr 23 09:22:03 2024]  ? exc_page_fault+0xdb/0x160
>> > [Tue Apr 23 09:22:03 2024]  ? asm_exc_page_fault+0x27/0x30
>> > [Tue Apr 23 09:22:03 2024]  ? cifs_flush_folio+0x41/0xe0 [cifs]
>> > [Tue Apr 23 09:22:03 2024]  cifs_remap_file_range+0x16c/0x5e0 [cifs]
>> > [Tue Apr 23 09:22:03 2024]  do_clone_file_range+0x107/0x290
>> > [Tue Apr 23 09:22:03 2024]  vfs_clone_file_range+0x3f/0x120
>> > [Tue Apr 23 09:22:03 2024]  ioctl_file_clone+0x4d/0xa0
>> > [Tue Apr 23 09:22:03 2024]  do_vfs_ioctl+0x35c/0x860
>> > [Tue Apr 23 09:22:03 2024]  __x64_sys_ioctl+0x73/0xd0
>> > [Tue Apr 23 09:22:03 2024]  do_syscall_64+0x5c/0x90
>> > [Tue Apr 23 09:22:03 2024]  ? srso_return_thunk+0x5/0x10
>> > [Tue Apr 23 09:22:03 2024]  ? exc_page_fault+0x80/0x160
>> > [Tue Apr 23 09:22:03 2024]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>>
>>
>>
>> --
>> Thanks,
>>
>> Steve

I reviewed the launchpad bug. The problem seems to be well understood:
>>> Since the Ubuntu mantic kernel consumes both 6.1.y and 6.7.y / 6.8.y stable patches, this patch was applied to mantic's 6.5 kernel by mistake, and contains the wrong logic for how __filemap_get_folio() works in 6.5.
So the order of backport application seems to have led to this problem.
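
This also matches the oops. R12 and CR2 are both fffffffffffffffe, which is -2 as a signed 64-bit value, i.e. what ERR_PTR(-ENOENT) looks like, and the Code: bytes appear to show a NULL test on r12 (4d 85 e4 74 7d) right before the faulting load. As far as I understand, __filemap_get_folio() in 6.5 returns an ERR_PTR on a miss rather than NULL, so a NULL-only check lets the error pointer through. Below is a rough userspace sketch of that mismatch. It is not the cifs code: the helpers are simplified copies of the idea behind include/linux/err.h, and lookup_miss(), old_check_says_valid() and new_check_says_valid() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Placeholder type; only its address matters in this sketch. */
struct folio {
	unsigned long flags;
};

/* Hypothetical lookup that misses and reports it the "ERR_PTR" way. */
static inline struct folio *lookup_miss(void)
{
	return ERR_PTR(-ENOENT);
}

/* Check written for a "NULL on miss" convention: accepts the error pointer. */
static inline int old_check_says_valid(const struct folio *folio)
{
	return folio != NULL;
}

/* Check written for the "ERR_PTR on miss" convention: rejects it. */
static inline int new_check_says_valid(const struct folio *folio)
{
	return !IS_ERR(folio);
}
```

On a 64-bit Linux build, lookup_miss() yields the same bit pattern as the faulting address above. old_check_says_valid() treats it as a usable folio, which is the point where a real caller would go on to dereference it, while new_check_says_valid() rejects it.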

-- 
Regards,
Shyam




