Re: kernel oops from commit 6869c46bd960

On Tue, 2021-03-30 at 17:01 +0800, Eryu Guan wrote:
> On Mon, Mar 29, 2021 at 09:26:24PM +0000, Trond Myklebust wrote:
> > Hi Olga,
> > 
> > On Mon, 2021-03-29 at 17:11 -0400, Olga Kornievskaia wrote:
> > > I'm on commit 20e0d860a4217b8e6a2f72852a5d6465e6104078 of your origin/testing
> > > 
> > > I just did mount -o vers=3,sec=sys <linux_server>:/ /mnt
> > > 
> > > Got the following oops.
> > > 
> > > I believe I bisected it to the following commit:
> > > 6869c46bd9607787f2f39dabf59da8f34dd3f513 "nfs: hornor timeo and retrans option when mounting NFSv3"
> > > 
> > > cb76aa233c4d060b2daa8077a5dc0f414ca682c1 "SUNRPC: Ensure the transport backchannel association"
> > > 
> > > [66946.322155] kernel BUG at fs/nfs/client.c:492!
> > > [66946.323863] invalid opcode: 0000 [#1] SMP KASAN PTI
> > > [66946.325630] CPU: 0 PID: 69573 Comm: mount.nfs Tainted: G        W         5.12.0-rc4+ #86
> > > [66946.328195] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
> > > [66946.331367] RIP: 0010:nfs_init_timeout_values+0x104/0x110 [nfs]
> > > [66946.333501] Code: 00 76 a3 49 c7 45 00 c0 27 09 00 bb c0 27 09 00 eb 94 e8 8f a8 f5 d7 41 bc 03 00 00 00 41 c7 45 18 02 00 00 00 e9 65 ff ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 b8 00 00 00
> > > [66946.339146] RSP: 0018:ffff888017c17908 EFLAGS: 00010287
> > > [66946.340624] RAX: 0000000000000000 RBX: ffffffffffffff9c RCX: ffffffffc15215a7
> > > [66946.342576] RDX: dffffc0000000000 RSI: 0000000000000011 RDI: ffffffffc158d6a0
> > > [66946.344483] RBP: 00000000ffffffff R08: 6d01a8c000000002 R09: 0000000000000000
> > > [66946.346391] R10: 6d01a8c000000002 R11: 0000000000000000 R12: 00000000ffffffff
> > > [66946.348367] R13: ffffffffc158d6a0 R14: 0000000000000011 R15: ffff888077a97820
> > > [66946.350280] FS:  00007f50b9827880(0000) GS:ffff888059600000(0000) knlGS:0000000000000000
> > > [66946.352440] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [66946.354087] CR2: 000055aa60e6b000 CR3: 000000000d7e8004 CR4: 00000000001706f0
> > > [66946.356261] Call Trace:
> > > [66946.356943]  nfs_mount+0x297/0x470 [nfs]
> > > [66946.358041]  ? mnt_xdr_dec_mountres+0x130/0x130 [nfs]
> > > [66946.359515]  ? ip_map_cache_destroy+0x80/0x80 [sunrpc]
> > > [66946.361471]  nfs_request_mount.constprop.17+0x205/0x310 [nfs]
> > > [66946.363036]  ? nfs_show_stats+0x7d0/0x7d0 [nfs]
> > > [66946.364352]  ? avc_has_extended_perms+0x760/0x760
> > > [66946.365772]  nfs_try_get_tree+0x18d/0x490 [nfs]
> > > [66946.367058]  ? nfs_get_tree_common+0x690/0x690 [nfs]
> > > [66946.368447]  ? cred_has_capability+0xf4/0x1e0
> > > [66946.369655]  ? _raw_spin_lock+0x7a/0xd0
> > > [66946.370889]  ? _raw_write_lock_bh+0xe0/0xe0
> > > [66946.372022]  ? __kmalloc_track_caller+0x136/0x450
> > > [66946.373365]  ? try_module_get+0x40/0xe0
> > > [66946.374440]  ? get_nfs_version+0x29/0x80 [nfs]
> > > [66946.375711]  ? nfs_get_tree+0x7ca/0xa20 [nfs]
> > > [66946.376982]  vfs_get_tree+0x45/0x120
> > > [66946.377966]  path_mount+0x914/0xd30
> > > [66946.378976]  ? __check_object_size+0x178/0x220
> > > [66946.380201]  ? finish_automount+0x2f0/0x2f0
> > > [66946.381304]  ? strncpy_from_user+0x1e4/0x250
> > > [66946.382584]  ? getname_flags+0x10d/0x2a0
> > > [66946.383642]  ? call_rcu+0x273/0x870
> > > [66946.384675]  do_mount+0xcb/0xf0
> > > [66946.385531]  ? path_mount+0xd30/0xd30
> > > [66946.386524]  ? _copy_from_user+0x4c/0x90
> > > [66946.387605]  ? copy_mount_options+0x59/0x100
> > > [66946.388821]  __x64_sys_mount+0xf4/0x110
> > > [66946.389860]  do_syscall_64+0x33/0x40
> > > [66946.390865]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > [66946.392225] RIP: 0033:0x7f50b8cb79ee
> > > [66946.393215] Code: 48 8b 0d 9d f4 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6a f4 2b 00 f7 d8 64 89 01 48
> > > [66946.398172] RSP: 002b:00007ffe53753188 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> > > [66946.400138] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f50b8cb79ee
> > > [66946.402202] RDX: 000055aa60e4a2b0 RSI: 000055aa60e4a290 RDI: 000055aa60e484d0
> > > [66946.404306] RBP: 00007ffe537533a0 R08: 000055aa60e4d180 R09: 000055aa60e4d170
> > > [66946.406418] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f50b98277f8
> > > [66946.408502] R13: 00007ffe537533a0 R14: 00007ffe53753280 R15: 000055aa60e4d140
> > > [66946.410600] Modules linked in: nfsv3 nfs_acl nfs_layout_nfsv41_files rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace nfs_ssc fuse rfcomm xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 nft_counter nft_compat nf_tables nfnetlink tun bridge stp llc vmw_vsock_vmci_transport vsock bnep snd_seq_midi snd_seq_midi_event intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul vmw_balloon ghash_clmulni_intel joydev pcspkr btusb uvcvideo btrtl btbcm btintel videobuf2_vmalloc snd_ens1371 videobuf2_memops videobuf2_v4l2 snd_ac97_codec ac97_bus videobuf2_common snd_seq bluetooth videodev snd_pcm rfkill mc ecdh_generic ecc snd_timer snd_rawmidi snd_seq_device snd soundcore i2c_piix4 vmw_vmci auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod cdrom sg ata_generic crc32c_intel vmwgfx drm_kms_helper serio_raw syscopyarea sysfillrect sysimgblt fb_sys_fops cec ttm nvme nvme_core t10_pi ahci libahci ata_piix drm vmxnet3 libata
> > > [66946.437026] ---[ end trace b2ce7b83b0ed50dc ]---
> > 
> > OK. Why is mnt_timeout declared to be a static variable in that patch?
> > Does the mount succeed if you let it be a stack variable?
> 
> I followed what nfs_umount() does. And the BUG() has nothing to do
> with the mnt_timeout variable.

Fair enough, but using a 'static' declaration there is still a bug, and
is a reason to reject the patch. Do you mind if I just edit that away?
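
For anyone following along, here is a minimal user-space sketch of why a
'static' local is a hazard for per-mount data. It is not the code from the
patch: struct model_timeout, struct model_mount_req, fill_static() and
fill_stack() are hypothetical names made up for illustration, and the
assumption is simply that each mount wants its own timeout values.

```c
/* Simplified stand-in for a per-mount timeout record (hypothetical). */
struct model_timeout {
	unsigned long initval;   /* initial timeout, arbitrary units */
	unsigned int  retries;   /* retransmit count */
};

/* Hypothetical per-mount request carrying the user's options. */
struct model_mount_req {
	unsigned long timeo;
	unsigned int  retrans;
};

/*
 * Static variant: there is exactly one 'tmo' object for the whole
 * program, so every caller gets a pointer to the same storage and a
 * later call overwrites what an earlier caller may still be reading.
 */
static struct model_timeout *fill_static(const struct model_mount_req *req)
{
	static struct model_timeout tmo;

	tmo.initval = req->timeo;
	tmo.retries = req->retrans;
	return &tmo;
}

/*
 * Stack variant: the caller owns the storage, so two mounts in flight
 * cannot see or clobber each other's values.
 */
static void fill_stack(const struct model_mount_req *req,
		       struct model_timeout *tmo)
{
	tmo->initval = req->timeo;
	tmo->retries = req->retrans;
}
```

Calling fill_static() for two different requests hands back the same
pointer both times, so the first caller ends up looking at the second
caller's values. With fill_stack() each caller passes its own struct and
the values stay separate.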

> The BUG() happens in nfs_init_timeout_values(), as the protocol is UDP
> but CONFIG_NFS_DISABLE_UDP_SUPPORT is set. I was testing with UDP
> enabled, so I didn't hit this crash.

I see. That probably explains why I didn't hit it either.

> The attached patch fixed the issue for me, but I'm not sure if it's the
> proper place to update the mount protocol (e.g. another option would be
> updating it when parsing mount options).

That BUG() was originally there to catch parsing errors, which is not
the case here. We should just undo that #ifdef in
nfs_init_timeout_values().

I'll add a patch.
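
To illustrate the failure mode discussed above, here is a simplified
user-space model, not the actual fs/nfs/client.c code. The enum, the
struct, the MODEL_DISABLE_UDP_SUPPORT macro and both function names are
hypothetical, the default values are only illustrative, and a return of -1
stands in for the kernel's BUG().

```c
/* Hypothetical transport identifiers for this model. */
enum model_proto { MODEL_PROTO_TCP, MODEL_PROTO_RDMA, MODEL_PROTO_UDP };

/* Hypothetical stand-in for the timeout values being initialised. */
struct model_timeout {
	unsigned int timeo;
	unsigned int retrans;
};

/* Models a build with CONFIG_NFS_DISABLE_UDP_SUPPORT set. */
#define MODEL_DISABLE_UDP_SUPPORT 1

/*
 * Guarded variant: with UDP support compiled out, the UDP case does not
 * exist, so a UDP request falls through to 'default', which is where the
 * kernel would hit BUG(). Here we return -1 instead.
 */
static int init_timeout_guarded(struct model_timeout *to, enum model_proto proto)
{
	switch (proto) {
	case MODEL_PROTO_TCP:
	case MODEL_PROTO_RDMA:
		to->timeo = 600;	/* illustrative default */
		to->retrans = 2;
		return 0;
#if !MODEL_DISABLE_UDP_SUPPORT
	case MODEL_PROTO_UDP:
		to->timeo = 11;		/* illustrative default */
		to->retrans = 3;
		return 0;
#endif
	default:
		return -1;		/* stands in for BUG() */
	}
}

/*
 * Unguarded variant: the UDP case is always compiled in, so 'default' is
 * only reached for a protocol value that is genuinely unknown.
 */
static int init_timeout_unguarded(struct model_timeout *to, enum model_proto proto)
{
	switch (proto) {
	case MODEL_PROTO_TCP:
	case MODEL_PROTO_RDMA:
		to->timeo = 600;
		to->retrans = 2;
		return 0;
	case MODEL_PROTO_UDP:
		to->timeo = 11;
		to->retrans = 3;
		return 0;
	default:
		return -1;
	}
}
```

In this model, passing MODEL_PROTO_UDP to init_timeout_guarded() returns
-1 while init_timeout_unguarded() fills in the UDP values and returns 0,
which is the difference that removing the #ifdef is meant to make.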

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





