Re: [PATCH v15 00/26] nfs/nfsd: add support for LOCALIO

On Fri, Sep 06, 2024 at 03:31:41PM -0400, Anna Schumaker wrote:
> Hi Mike,
> 
> On 8/31/24 6:37 PM, Mike Snitzer wrote:
> > Hi,
> > 
> > Happy Labor Day weekend (US holiday on Monday)!  Seems apropos to send
> > what I hope is the final LOCALIO patchset this weekend: it's my birthday
> > this coming Tuesday, so _if_ LOCALIO were to get merged for 6.12
> > inclusion sometime next week: best b-day gift in a while! ;)
> > 
> > Anyway, I've been busy incorporating all the review feedback from v14
> > _and_ working closely with NeilBrown to address some lingering net-ns
> > refcounting and nfsd module refcounting issues, and more (Changelog
> > below):
> > 
> 
> I've been running tests on localio this afternoon after finishing up going through v15 of the patches (I was most of the way through when you posted v16, so I haven't updated yet!). Cthon tests passed on all NFS versions, and xfstests passed on NFS v4.x. However, I saw this crash from xfstests with NFS v3:
> 
> [ 1502.440896] run fstests generic/633 at 2024-09-06 14:04:17
> [ 1502.694356] process 'vfstest' launched '/dev/fd/4/file1' with NULL argv: empty string added
> [ 1502.699514] Oops: general protection fault, probably for non-canonical address 0x6c616e69665f6140: 0000 [#1] PREEMPT SMP NOPTI
> [ 1502.700970] CPU: 3 UID: 0 PID: 513 Comm: nfsd Not tainted 6.11.0-rc6-g0c79a48cd64d-dirty+ #42323 70d41673e6cbf8e3437eb227e0a9c3c46ed3b289
> [ 1502.702506] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
> [ 1502.703593] RIP: 0010:nfsd_cache_lookup+0x2b3/0x840 [nfsd]
> [ 1502.704474] Code: 8d bb 30 02 00 00 bb 01 00 00 00 eb 12 49 8d 46 10 48 8b 08 ff c3 48 85 c9 0f 84 9c 00 00 00 49 89 ce 4c 8d 61 c8 41 8b 45 00 <3b> 41 c8 75 1f 41 8b 45 04 41 3b 46 cc 74 15 8b 15 2c c6 b8 f2 be
> [ 1502.706931] RSP: 0018:ffffc27ac0a2fd18 EFLAGS: 00010206
> [ 1502.707547] RAX: 00000000b95691f7 RBX: 0000000000000002 RCX: 6c616e69665f6178
> [ 1502.708311] RDX: 0000000000000034 RSI: ffffa0f8a652a780 RDI: ffffa0f8c04cfb00
> [ 1502.709055] RBP: ffffa0f8827b2ba0 R08: 0000000000000000 R09: ffffa0f8c04cfb00
> [ 1502.709728] R10: 000000000000009c R11: ffffffffc0c77ef0 R12: 6c616e69665f6140
> [ 1502.710382] R13: ffffa0f8c04cfb00 R14: 6c616e69665f6178 R15: ffffa0f883d4e230
> [ 1502.710982] FS:  0000000000000000(0000) GS:ffffa0f8fbd80000(0000) knlGS:0000000000000000
> [ 1502.711645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1502.712087] CR2: 00007f2c4d1ed640 CR3: 0000000117a1e000 CR4: 0000000000750ef0
> [ 1502.712615] PKRU: 55555554
> [ 1502.712804] Call Trace:
> [ 1502.712979]  <TASK>
> [ 1502.713131]  ? __die_body+0x6a/0xb0
> [ 1502.713372]  ? die_addr+0xa4/0xd0
> [ 1502.713583]  ? exc_general_protection+0x16c/0x210
> [ 1502.713880]  ? asm_exc_general_protection+0x26/0x30
> [ 1502.714164]  ? __pfx_nfs3svc_decode_sattrargs+0x10/0x10 [nfsd a9c12e0cc9647b021c55f7745e60fc1cbe54674a]
> [ 1502.714700]  ? nfsd_cache_lookup+0x2b3/0x840 [nfsd a9c12e0cc9647b021c55f7745e60fc1cbe54674a]
> [ 1502.715156]  ? nfsd_cache_lookup+0x2e7/0x840 [nfsd a9c12e0cc9647b021c55f7745e60fc1cbe54674a]
> [ 1502.715590]  nfsd_dispatch+0x93/0x210 [nfsd a9c12e0cc9647b021c55f7745e60fc1cbe54674a]
> [ 1502.715997]  svc_process_common+0x324/0x680 [sunrpc 2f7328527f188558dea7880294960ba75bb09c81]
> [ 1502.716439]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd a9c12e0cc9647b021c55f7745e60fc1cbe54674a]
> [ 1502.716873]  svc_process+0x117/0x1c0 [sunrpc 2f7328527f188558dea7880294960ba75bb09c81]
> [ 1502.717276]  svc_recv+0xabf/0xc00 [sunrpc 2f7328527f188558dea7880294960ba75bb09c81]
> [ 1502.717674]  nfsd+0xc5/0x100 [nfsd a9c12e0cc9647b021c55f7745e60fc1cbe54674a]
> [ 1502.718225]  ? __pfx_nfsd+0x10/0x10 [nfsd a9c12e0cc9647b021c55f7745e60fc1cbe54674a]
> [ 1502.718641]  kthread+0xe9/0x110
> [ 1502.718798]  ? __pfx_kthread+0x10/0x10
> [ 1502.718979]  ret_from_fork+0x37/0x50
> [ 1502.719154]  ? __pfx_kthread+0x10/0x10
> [ 1502.719335]  ret_from_fork_asm+0x1a/0x30
> [ 1502.719525]  </TASK>
> [ 1502.719636] Modules linked in: nfsv3 overlay cbc cts rpcsec_gss_krb5 nfsv4 nfs rpcrdma rdma_cm iw_cm ib_cm cfg80211 ib_core rfkill 8021q garp stp mrp llc vfat fat intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry pmt_class kvm_intel kvm snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg crct10dif_pclmul crc32_pclmul snd_hda_codec polyval_clmulni polyval_generic ghash_clmulni_intel snd_hwdep sha512_ssse3 snd_hda_core sha256_ssse3 sha1_ssse3 iTCO_wdt snd_pcm intel_pmc_bxt iTCO_vendor_support aesni_intel snd_timer gf128mul snd psmouse crypto_simd i2c_i801 cryptd joydev pcspkr rapl lpc_ich i2c_smbus soundcore mousedev mac_hid nfsd nfs_acl lockd auth_rpcgss grace nfs_localio sunrpc usbip_host dm_mod usbip_core loop nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock qemu_fw_cfg ip_tables x_tables hid_generic usbhid xfs libcrc32c crc32c_generic serio_raw atkbd libps2 virtio_net vivaldi_fmap virtio_gpu virtio_console
> [ 1502.719684]  net_failover virtio_blk crc32c_intel i8042 failover virtio_rng xhci_pci intel_agp virtio_balloon xhci_pci_renesas virtio_dma_buf serio intel_gtt
> [ 1502.724436] ---[ end trace 0000000000000000 ]---
> 
> Please let me know if there are any other details you need about my setup to help debug this!

Hmm, I haven't seen this issue; my runs of xfstests with LOCALIO
enabled look solid:
https://evilpiepirate.org/~testdashboard/ci?user=snitzer&branch=snitm-nfs-next&test=^fs.nfs.fstests.generic.633$

And I know Chuck has been testing xfstests and more with the patches
applied but LOCALIO disabled in his kernel config.

The stack seems to indicate nfsd is just handling a request (so it
isn't using LOCALIO, at least not for this op).

It's probably best to try v16, since it addresses issues that v15
has.  If you can reproduce with v16, please share your kernel .config
and xfstests config.

Note that I've only really tested my changes against v6.11-rc4.  But I
can rebase on v6.11-rc6 if you find v16 still fails for you.

Thanks,
Mike




