Re: [PATCH/rfc v2] NFS: introduce NFS namespaces.

On Wed, 7 Jul 2021 at 02:12, NeilBrown <neilb@xxxxxxx> wrote:
>
> On Wed, 07 Jul 2021, Daire Byrne wrote:
> > On Sun, 4 Jul 2021 at 00:03, NeilBrown <neilb@xxxxxxx> wrote:
> > > > [  360.481824] ------------[ cut here ]------------
> > > > [  360.483141] kernel BUG at mm/slub.c:4205!
> > >
> > > Thanks for testing!
> > >
> > > I misunderstood the use of kfree_const().  It doesn't work for
> > > constants in modules, only constants in vmlinux.  So I guess you built
> > > nfs as a module.
> > >
> > > This version should fix that.
> > >
> > > Thanks,
> > > NeilBrown
> >
> > Yep, that was the issue and the latest patch certainly helped. I ran a
> > few load tests and everything seemed to be working fine.
> >
> > However, once I tried mounting the same server again using a different
> > namespace, I got a different looking crash under moderate load. I am
> > pretty sure I applied your latest patch correctly, but I'll double
> > check. I should probably remove some of the other patches I have
> > applied too.
> >
> > # mount -o vers=4.2 server:/srv/export /mnt/server1
> > # mount -o vers=4.2,namespace=server2 server:/srv/export /mnt/server2
> >
> > [ 3626.638077] general protection fault, probably for non-canonical address 0x375f656c6966ff00: 0000 [#1] SMP PTI
> > [ 3626.640538] CPU: 9 PID: 12053 Comm: ls Not tainted 5.13.0-1.dneg.x86_64 #1
> > [ 3626.642270] Hardware name: Red Hat dneg, BIOS 1.11.1-4.module_el8.2.0+320+13f867d7 04/01/2014
> > [ 3626.644443] RIP: 0010:__kmalloc_track_caller+0xfa/0x480
> > [ 3626.646138] Code: 65 4c 03 05 28 4d d5 69 49 83 78 10 00 4d 8b 20 0f 84 4c 03 00 00 4d 85 e4 0f 84 43 03 00 00 41 8b 47 28 49 8b 3f 48 8d 4a 01 <49> 8b 1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 41 8b 47
> > [ 3626.650253] RSP: 0018:ffffaadecf2afb90 EFLAGS: 00010206
> > [ 3626.651747] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000003d41
> > [ 3626.653479] RDX: 0000000000003d40 RSI: 0000000000000cc0 RDI: 000000000002fbe0
> > [ 3626.655293] RBP: ffffaadecf2afbd0 R08: ffff985aabc6fbe0 R09: ffff985689c76b20
> > [ 3626.657034] R10: ffff9858408a0000 R11: ffff985966e69ec0 R12: 375f656c6966ff00
> > [ 3626.658794] R13: 0000000000000000 R14: 0000000000000cc0 R15: ffff985680042200
>
> The above Code: shows the crash happens at
>
>   2a:*  49 8b 1c 04             mov    (%r12,%rax,1),%rbx               <-- trapping instruction
>
> and %r12 (which should be a memory address) is 375f656c6966ff00, which
> contains ASCII "file_7".
> So my guess is that a file name was copied into a buffer that had
> already been freed.
> This could be caused by a malloc bug somewhere else, but as the crash
> was in readdir code, and shows evidence of a file name, it seems likely
> that the bug is nearby.  Do you have patches to anything that works
> with file names?
>
> NeilBrown
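
For anyone wanting to check the ASCII reading of %r12 above, here is a
minimal sketch in Python. It assumes the register value is laid out in
memory as a little-endian 64-bit word (as on x86_64); the helper name
register_to_ascii is made up for illustration and is not part of any
kernel tooling.

```python
def register_to_ascii(value: int) -> str:
    """Render a 64-bit register value as the bytes would appear in memory.

    Assumes little-endian layout (x86_64). Non-printable bytes are shown
    as '.' so that embedded text stands out.
    """
    raw = value.to_bytes(8, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


if __name__ == "__main__":
    # %r12 from the oops above
    print(register_to_ascii(0x375F656C6966FF00))  # prints "..file_7"
```

The two leading bytes (0x00 and 0xff) are not printable; the remaining
six spell "file_7", matching the reading given above.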

I stripped out all my patches so it's just this one on top of 5.13-rc7
and I can still reproduce it.

I can only trigger it by mounting the same export (RHEL7 server) under
two different namespaces and running a heavy IO benchmark against one
of the mounts while leaving the other idle. Part of the benchmark walks
thousands of directories containing files (hence the readdirs).

If I mount the same server twice without the namespace option (so both
mounts are in the same namespace), it works fine without any crash,
even with the patch applied.

Daire
