Re: kernel BUG at /build/buildd/linux-3.2.0/fs/lockd/clntxdr.c:226!

Chuck Lever <chuck.lever@xxxxxxxxxx> · Sun, 14 Oct 2012 16:55:44 -0400

On Oct 14, 2012, at 3:39 PM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote:

> On Sat, Oct 13, 2012 at 02:28:39AM +0000, Myklebust, Trond wrote:
>> On Sat, 2012-10-13 at 10:02 +0900, Linus Torvalds wrote:
>>> On Sat, Oct 13, 2012 at 9:21 AM, Larry McVoy <lm@xxxxxxxxxxxx> wrote:
>>>> 
>>>> Ahh, I've been away from the kernel too long.  I miss that delicate
>>>> management touch.
>>> 
>>> "Delicate Management Touch" is my middle name.
>>> 
>>>> pics of the stack trace at http://www.mcvoy.com/lm/nfs-lock-crash
>>> 
>>> Ok, that's just the normal kind of random left-over oopses due to
>>> subsequent problems of a BUG_ON(). Looks like the watchdog timer ends
>>> up being unhappy, almost certainly simply because some core filesystem
>>> spinlock is not being released.
>>> 
>>> It used to be (a long long time ago) that we'd recover fairly
>>> gracefully from BUG_ON()'s - back when the main shared lock we had was
>>> the kernel lock, and we had a single per-process kernel lock counter.
>>> So when we killed the process, we could clean that single lock up.
>>> 
>>> These days, if some process dies in random kernel code due to a
>>> BUG_ON() or a wild pointer or similar, and we kill it, we are seldom
>>> able to do so cleanly. So the best we can hope for is that it happened
>>> in some context where it held no (important) locks. Which is rare. So
>>> BUG_ON()'s are often fatal, and there are these kinds of downstream
>>> problems where they get flushed off the screen by subsequent issues...
>> 
>> If that code is being called under a lock, then we have other problems.
>> It is standard XDR code: it should always be called from an ordinary
>> process context with no special locks being held by the callers.
>> 
>>> Ho humm. Google doesn't seem to be finding any similar bug-reports, so
>>> unless Bruce or Trond go "Ahh, I know what it's about", I do think we
>>> would want to get as much more info as possible.
>> 
>> Never seen it before, and I see no reason why it should drag the entire
>> box down with it. It is part of the NLM server's callback code, so there
>> is no chance of it being called as part of a memory reclaim or anything
>> similarly sensitive to the rest of the box.
>> 
>> Are we sure that this BUG_ON() really is top of the chain of Oopses
>> here? All I can see it doing is crashing the lockd server process,
> 
> Can't it be called from the rpciod workqueue?  I'm not sure what happens
> when we hit a BUG there.
> 
> It looks like a bunch of BUG_ON's got added with an xdr rewrite in
> 2b061f9ef216b6d229b06267f188167fd6ab3d9b.  Maybe Chuck or someone should
> do a 'git grep BUG fs/lockd' and figure out what those should be
> instead?

In my own defense, I also removed a lot of BUG_ON's in that series.

This particular one was added because "getting the endianness of the reply status code wrong" turns out to be a fairly easy problem to introduce without noticing when making changes to the NFSD or lockd server code.  There is a set of status codes that are already XDR-encoded, and a set that are not.  It is easy to pick the wrong one.

I expected that such problems would be caught quickly during development if we actually checked for them in the XDR layer and barked when a value was incorrect.  I therefore assumed that these assertions would never fire by the time the code got in front of users.

I think range-check assertions in the XDR code are valuable.  Whether they are done via BUG_ON or WARN_ON_ONCE is a matter of priority: BUG_ON forces you to notice the problem and address it, while WARN_ON_ONCE lets the system continue operating with the bug, at the cost that the bug can be ignored (or the WARN_ON_ONCE simply removed because it is annoying).
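
To make the trade-off concrete, here is a rough user-space sketch of the kind of range check I mean.  It is not the code in fs/lockd/clntxdr.c; every name in it (wire32, to_wire, to_wire_wrong, stat_in_range, encode_stat_strict, encode_stat_lenient, and the DEMO_ constants) is made up for illustration.  The only thing taken from the protocol is that the NLM v1-v3 status codes run from 0 to 4.  The strict variant uses assert() as a stand-in for BUG_ON, and the lenient variant prints one warning as a stand-in for WARN_ON_ONCE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* NLM v1-v3 status codes in host order (illustrative DEMO_ names). */
enum {
	DEMO_NLM_LCK_GRANTED = 0,
	DEMO_NLM_LCK_DENIED = 1,
	DEMO_NLM_LCK_DENIED_NOLOCKS = 2,
	DEMO_NLM_LCK_BLOCKED = 3,
	DEMO_NLM_LCK_DENIED_GRACE_PERIOD = 4,	/* largest valid value */
};

/* A status word as it appears on the wire: four bytes, big-endian. */
typedef struct { uint8_t b[4]; } wire32;

/* Correct conversion: host value to big-endian wire bytes. */
static wire32 to_wire(uint32_t v)
{
	wire32 w = { { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
		       (uint8_t)(v >> 8), (uint8_t)v } };
	return w;
}

/* The mistake being guarded against: bytes stored in the wrong order. */
static wire32 to_wire_wrong(uint32_t v)
{
	wire32 w = { { (uint8_t)v, (uint8_t)(v >> 8),
		       (uint8_t)(v >> 16), (uint8_t)(v >> 24) } };
	return w;
}

/* Decode big-endian wire bytes back to a host value. */
static uint32_t from_wire(wire32 w)
{
	return ((uint32_t)w.b[0] << 24) | ((uint32_t)w.b[1] << 16) |
	       ((uint32_t)w.b[2] << 8) | (uint32_t)w.b[3];
}

/* The range check itself: decode, then compare with the largest code. */
static bool stat_in_range(wire32 stat)
{
	return from_wire(stat) <= DEMO_NLM_LCK_DENIED_GRACE_PERIOD;
}

/* BUG_ON-style: refuse to continue when the status is out of range. */
static void encode_stat_strict(wire32 stat, uint8_t out[4])
{
	assert(stat_in_range(stat));
	memcpy(out, stat.b, 4);
}

/*
 * WARN_ON_ONCE-style: complain once, then encode the value anyway.
 * Returns false when the status was out of range.
 */
static bool encode_stat_lenient(wire32 stat, uint8_t out[4])
{
	static bool warned;
	bool ok = stat_in_range(stat);

	if (!ok && !warned) {
		warned = true;
		fprintf(stderr, "status 0x%08x out of range\n",
			(unsigned int)from_wire(stat));
	}
	memcpy(out, stat.b, 4);
	return ok;
}
```

A status of 1 stored in the wrong byte order decodes as 0x01000000, which is far above 4, so the check trips.  A status of 0 looks the same in either order, so a check like this cannot catch that case.  The strict variant stops the program at the bad value; the lenient variant still puts the bad value on the wire and leaves it to someone to read the log.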

> And I need to do the same for nfsd; I've been sloppy about using them as
> asserts.
> 
> --b.
> 
>> which
>> will seriously inconvenience all the NFS clients trying to do locking,
>> but it shouldn't be affecting the swapper process as we're seeing in the
>> Oops screenshots.
>> If it really is the first thing to Oops, then the only thing I can think
>> of there that would trigger other Oopses would be a memory corruption
>> (use after free or some such thing?). Perhaps Larry could try turning on
>> some of the less intrusive slab debugging options?
>> 
>>> Doing a kernel compile really isn't that bad. The only nasty piece is
>>> getting the kernel configuration right, but you can just use the
>>> distro config. It's much too big and contains everything, but it will
>>> work, and gets you as similar a kernel as possible. Of course, Ubuntu
>>> has made installing your own kernel stupidly complicated (you have to
>>> build a package and install it using the package manager), but while
>>> it's an annoying extra step or two (compared to just doing a "make
>>> modules_install install"), it's not rocket surgery. There's a few help
>>> pages for it:
>>> 
>>>    https://help.ubuntu.com/community/Kernel/Compile
>>> 
>>> being the first one.
>>> 
>>>                Linus
>> 
>> -- 
>> Trond Myklebust
>> Linux NFS client maintainer
>> 
>> NetApp
>> Trond.Myklebust@xxxxxxxxxx
>> www.netapp.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
