Re: general protection fault, probably for non-canonical address in nfsd

On Sunday, June 7, 2020 10:32:44 AM CDT Hans-Peter Jansen wrote:
> Hi,
> 
> after upgrading the kernel from 5.6.11 to 5.6.14, we suffer from regular
> crashes of nfsd here:
> 
> 2020-06-07T01:32:43.600306+02:00 server rpc.mountd[2664]: authenticated mount request from 192.168.3.16:303 for /work (/work)
> 2020-06-07T01:32:43.602594+02:00 server rpc.mountd[2664]: authenticated mount request from 192.168.3.16:304 for /work/vmware (/work)
> 2020-06-07T01:32:43.602971+02:00 server rpc.mountd[2664]: authenticated mount request from 192.168.3.16:305 for /work/vSphere (/work)
> 2020-06-07T01:32:43.606276+02:00 server kernel: [51901.089211] general protection fault, probably for non-canonical address 0xb9159d506ba40000: 0000 [#1] SMP PTI
> 2020-06-07T01:32:43.606284+02:00 server kernel: [51901.089226] CPU: 1 PID: 3190 Comm: nfsd Tainted: G           O      5.6.14-lp151.2-default #1 openSUSE Tumbleweed (unreleased)
> 2020-06-07T01:32:43.606286+02:00 server kernel: [51901.089234] Hardware name: System manufacturer System Product Name/P7F-E, BIOS 0906    09/20/2010
> 2020-06-07T01:32:43.606287+02:00 server kernel: [51901.089247] RIP: 0010:cgroup_sk_free+0x26/0x80
> 2020-06-07T01:32:43.606288+02:00 server kernel: [51901.089257] Code: 00 00 00 00 66 66 66 66 90 53 48 8b 07 48 c7 c3 30 72 07 b6 a8 01 75 07 48 85 c0 48 0f 45 d8 48 8b 83 18 09 00 00 a8 03 75 1a <65> 48 ff 08 f6 43 7c 01 74 02 5b c3 48 8b 43 18 a8 03 75 26 65 48
> 2020-06-07T01:32:43.606290+02:00 server kernel: [51901.089276] RSP: 0018:ffffb248c21e7e10 EFLAGS: 00010246
> 2020-06-07T01:32:43.606291+02:00 server kernel: [51901.089280] RAX: b91603a504000000 RBX: ffff99ab141a0000 RCX: 0000000000000021
> 2020-06-07T01:32:43.606292+02:00 server kernel: [51901.089284] RDX: ffffffffb6135ec4 RSI: 0000000000010080 RDI: ffff99a7159c1490
> 2020-06-07T01:32:43.606293+02:00 server kernel: [51901.089287] RBP: ffff99a7159c1200 R08: ffff99ab67a60c60 R09: 000000000002eb00
> 2020-06-07T01:32:43.606294+02:00 server kernel: [51901.089291] R10: ffffb248c0087dc0 R11: 00000000000000c6 R12: 0000000000000000
> 2020-06-07T01:32:43.606295+02:00 server kernel: [51901.089294] R13: 0000000000000103 R14: ffff99aae4934238 R15: ffff99ab31902000
> 2020-06-07T01:32:43.606296+02:00 server kernel: [51901.089299] FS:  0000000000000000(0000) GS:ffff99ab67a40000(0000) knlGS:0000000000000000
> 2020-06-07T01:32:43.606297+02:00 server kernel: [51901.089303] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2020-06-07T01:32:43.606303+02:00 server kernel: [51901.089305] CR2: 00000000008e0000 CR3: 00000004df60a000 CR4: 00000000000026e0
> 2020-06-07T01:32:43.606304+02:00 server kernel: [51901.089307] Call Trace:
> 2020-06-07T01:32:43.606305+02:00 server kernel: [51901.089315]  __sk_destruct+0x10d/0x1d0
> 2020-06-07T01:32:43.606306+02:00 server kernel: [51901.089319]  inet_release+0x34/0x60
> 2020-06-07T01:32:43.606307+02:00 server kernel: [51901.089325]  __sock_release+0x81/0xb0
> 2020-06-07T01:32:43.606308+02:00 server kernel: [51901.089358]  svc_sock_free+0x38/0x60 [sunrpc]
> 2020-06-07T01:32:43.606308+02:00 server kernel: [51901.089374]  svc_xprt_put+0x99/0xe0 [sunrpc]
> 2020-06-07T01:32:43.606310+02:00 server kernel: [51901.089389]  svc_recv+0x9c0/0xa40 [sunrpc]
> 2020-06-07T01:32:43.606310+02:00 server kernel: [51901.089410]  ? nfsd_destroy+0x60/0x60 [nfsd]
> 2020-06-07T01:32:43.606311+02:00 server kernel: [51901.089417]  nfsd+0xd1/0x150 [nfsd]
> 2020-06-07T01:32:43.606312+02:00 server kernel: [51901.089420]  kthread+0x10d/0x130
> 2020-06-07T01:32:43.606313+02:00 server kernel: [51901.089423]  ? kthread_park+0x90/0x90
> 2020-06-07T01:32:43.606314+02:00 server kernel: [51901.089426]  ret_from_fork+0x35/0x40
> 
> A vSphere 5.5 host accesses this Linux server with NFS v3 for backup
> purposes (a Veeam backup server wants to store a new backup here).
> 
> The kernel is tainted due to vboxdrv. The OS is openSUSE Leap 15.1,
> with the kernel and VirtualBox replaced with up-to-date versions from
> proper rpm packages (built on that very vSphere host in an OBS server
> VM).
> 
> I used to be subscribed to this ML, but that subscription has been
> lost 04/09, thus I cannot reply properly to the general prot. fault
> thread, started 05/12 from syzbot with Bruce looking into it.
> 
> It seems somewhat related.
> 
> Interestingly, we're using a couple of NFS v4 mounts for subsets of
> home here, and mount /work and other shares from various
> Tumbleweed systems with NFS v4 here without any undesired effects.
> 
> Since the kernel upgrade, the crash happens every time this Veeam
> thing triggers these v3 mounts. I've disabled this backup target for
> now until the problem is resolved, because the crash effectively
> prevents further NFS accesses to this server and blocks our desktops
> until the server is rebooted.
> 
> A cursory look into the 5.6.{15,16} changelogs suggests that this
> issue is still pending.
> 
> Let me know if I can provide any further information.
> 
> Thanks,
> Pete

I see similar issues in Fedora kernels 5.6.14 through 5.6.16:
https://bugzilla.redhat.com/show_bug.cgi?id=1839287

On the client I mount /home with sec=krb5p, and /mnt/koji with sec=krb5.
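
For anyone reading the quoted trace: the "non-canonical address" wording can be checked by hand. The snippet below is only a rough sketch, not anything taken from the kernel. It assumes 4-level paging with 48-bit virtual addresses (the helper name is_canonical and the va_bits parameter are made up for illustration). Under that assumption an address is canonical when bits 63..47 are all equal, and the faulting value from the log fails that test, while a normal kernel pointer such as RBX passes.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    Assumption: va_bits significant bits (48 for 4-level paging).
    Canonical means bits 63..(va_bits - 1) are all copies of one bit.
    """
    top = addr >> (va_bits - 1)                  # bits 63..47 when va_bits == 48
    all_ones = (1 << (64 - va_bits + 1)) - 1     # 17 one-bits when va_bits == 48
    return top == 0 or top == all_ones


# Values copied from the quoted oops.
fault_addr = 0xb9159d506ba40000   # address named in the GPF line
rbx = 0xffff99ab141a0000          # RBX, an ordinary-looking kernel pointer

print(hex(fault_addr), is_canonical(fault_addr))
print(hex(rbx), is_canonical(rbx))
```

RAX (b91603a504000000) fails the same check, so a corrupted or stale pointer being dereferenced in cgroup_sk_free looks plausible, though the trace alone does not prove where the bad value came from.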

-- 
Anthony - https://messinet.com
F9B6 560E 68EA 037D 8C3D  D1C9 FF31 3BDB D9D8 99B6
