Re: WARNING: at lib/debugobjects.c:262 debug_print_object+0x8c/0xb0()

Jeff Layton <jlayton@xxxxxxxxxx> · Tue, 24 Jan 2012 12:43:53 -0500

On Tue, 24 Jan 2012 11:32:34 -0500
Jeff Layton <jlayton@xxxxxxxxxx> wrote:

> On Tue, 24 Jan 2012 17:01:29 +0200
> Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> 
> > On 01/24/2012 02:36 PM, Jeff Layton wrote:
> > > 
> > > No, I don't think the state would be undefined after
> > > cancel_delayed_work_sync. In principle you could requeue that work
> > > again if you like without needing to reinitialize it.
> > > 
> > > I think this is a problem in the debugobjects code. It doesn't have
> > > any way to know that when the object is recycled out of the slab that
> > > the work is already initialized.
> > > 
> > 
> > The only difference between your above example of requeue after
> > cancel_delayed_work_sync, and this here is the visit back to the
> > slab. Does the slab (maybe in debug mode) stomp over some of the
> > record's memory?
> > 
> > If the memory is unchanged, what then is the difference between the
> > two cases?
> > 
> > > Certainly it's simple enough to reinitialize the work every time we
> > > allocate an inode here, but I don't think this is really a rpc_pipefs
> > > bug per se.
> > 
> > That depends on the API's intention. If an init is intended after a
> > slab free, then yes; if not, then no. We should ask what the
> > intention of this API is.
> > 
> > > I can send a patch that works around this problem, but
> > > if there are plans to fix this in the debugobjects code, I won't
> > > bother...
> > > 
> > 
> > You mean a fix other than calling INIT_DELAYED_WORK? Is that so
> > bad that we need more code to avoid it?
> > 
> 
> I'm not opposed to a patch that sidesteps this problem, but I want to
> make sure we understand it so that we don't get bitten by it in other
> places. That's a good point. I hadn't considered whether memory
> poisoning is a factor. In the kernel I was testing:
> 
> CONFIG_SLUB=y
> CONFIG_SLUB_DEBUG_ON=y
> 
> ...just to be sure:
> 
> # cat /sys/kernel/slab/rpc_inode_cache/poison 
> 1
> 
> Looking at the code...
> 
> It looks like SLAB will call the ctor on every object when it's
> allocated, even if it was recycled from an existing slab. SLUB doesn't
> do that however -- as best I can tell it avoids poisoning objects when
> there is a ctor function, so they don't get reinitialized like they
> would with SLAB.
> 
> Probably the best solution here is to eliminate the ctor function and
> just initialize the objects whenever they're allocated. Since these
> objects aren't frequently recycled, there's little benefit to
> keeping it around, IMO. I'll spin up a patch for that soon.
> 
> Still, I wonder if there are other problems like this around. The slab
> allocators seem to call debug_check_no_obj_freed() on kmem_cache_free,
> but parts of the objects themselves (like the timer in the work object
> here) get initialized in other places and aren't necessarily
> reinitialized when they're recycled out of the slab...
> 

On second thought...getting rid of the ctor function here might be
problematic. We have to call inode_init_once, etc...

Almost all of the inode slabs have one, so I've settled for just moving
the INIT_DELAYED_WORK call out of init_once and into rpc_alloc_inode. I
sent a patch to Trond and linux-nfs to do that. That will fix this
case, but I do wonder if there are other places in the kernel that have
similar problems with debugobject initialization.
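
To make the difference concrete, here is a small userspace sketch of the
two arrangements. It is only a toy model and not the actual patch: every
toy_* name is hypothetical, the one-slot cache stands in for a slab that
runs its ctor only when an object is first constructed, and
toy_debug_forget() models my reading of debug_check_no_obj_freed(), i.e.
that the tracker drops its record of anything living in freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a delayed_work with a debugobjects-tracked timer. */
struct toy_work {
    bool tracked_as_initialized;   /* what the debug tracker believes */
};

struct toy_inode {
    struct toy_work queue_timeout;
    bool in_use;
};

/* Models INIT_DELAYED_WORK(): tells the tracker the object is initialized. */
static void toy_init_work(struct toy_work *w) { w->tracked_as_initialized = true; }

/* Models the free-time check: the tracker forgets objects in freed memory. */
static void toy_debug_forget(struct toy_work *w) { w->tracked_as_initialized = false; }

/* Returns false in the situation where debugobjects would print a warning. */
static bool toy_queue_work(const struct toy_work *w) { return w->tracked_as_initialized; }

/* One-slot cache: the ctor runs only the first time the slot is handed out. */
struct toy_cache {
    struct toy_inode slot;
    bool slot_constructed;
    void (*ctor)(struct toy_inode *);
};

static struct toy_inode *toy_cache_alloc(struct toy_cache *c)
{
    assert(!c->slot.in_use);
    if (!c->slot_constructed) {
        if (c->ctor)
            c->ctor(&c->slot);
        c->slot_constructed = true;
    }
    c->slot.in_use = true;
    return &c->slot;
}

static void toy_cache_free(struct toy_cache *c, struct toy_inode *obj)
{
    assert(obj == &c->slot && obj->in_use);
    toy_debug_forget(&obj->queue_timeout);
    obj->in_use = false;
}

/* Before: the work is initialized in the ctor (the init_once arrangement). */
static void ctor_with_work(struct toy_inode *i) { toy_init_work(&i->queue_timeout); }

/* After: the ctor keeps only its once-only setup (nothing in this toy). */
static void ctor_without_work(struct toy_inode *i) { (void)i; }

/* After: the work is initialized on every allocation instead. */
static struct toy_inode *alloc_inode_fixed(struct toy_cache *c)
{
    struct toy_inode *i = toy_cache_alloc(c);
    toy_init_work(&i->queue_timeout);
    return i;
}

/* Allocate, free, reallocate, then try to queue work on the recycled object. */
static bool recycled_object_is_clean(bool use_fix)
{
    struct toy_cache c = { .ctor = use_fix ? ctor_without_work : ctor_with_work };
    struct toy_inode *i = use_fix ? alloc_inode_fixed(&c) : toy_cache_alloc(&c);

    toy_cache_free(&c, i);
    i = use_fix ? alloc_inode_fixed(&c) : toy_cache_alloc(&c);
    return toy_queue_work(&i->queue_timeout);
}
```

With the ctor-only arrangement, recycled_object_is_clean(false) reports
a problem on the second allocation, because the ctor is not rerun and
the tracker no longer knows about the work. With initialization moved
to allocation time, recycled_object_is_clean(true) comes back clean.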

-- 
Jeff Layton <jlayton@xxxxxxxxxx>