Re: Oops in rpc_clnt_debugfs_register() from debugfs change

On Tue, Feb 12, 2019 at 03:37:20PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Feb 12, 2019 at 02:31:14PM +0000, David Howells wrote:
> > I've bisected an oops that occurs in rpc_clnt_debugfs_register() trying to
> > dereference a pointer with -EACCES in it.  This is the causing commit, though
> > I suspect the bug is in sunrpc expecting to see NULL rather than an error.
> > 
> > ff9fb72bc07705c00795ca48631f7fffe24d2c6b is the first bad commit
> > commit ff9fb72bc07705c00795ca48631f7fffe24d2c6b
> > Author: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > Date:   Wed Jan 23 11:28:14 2019 +0100
> > 
> >     debugfs: return error values, not NULL
> >     
> >     When an error happens, debugfs should return an error pointer value, not
> >     NULL.  This will prevent the totally theoretical error where a debugfs
> >     call fails due to lack of memory, returning NULL, and that dentry value
> >     is then passed to another debugfs call, which would end up succeeding,
> >     creating a file at the root of the debugfs tree, but would then be
> >     impossible to remove (because you can not remove the directory NULL).
> >     
> >     So, to make everyone happy, always return errors, this makes the users
> >     of debugfs much simpler (they do not have to ever check the return
> >     value), and everyone can rest easy.
> >     ...
> > 
> > The attached oops occurs during boot from the gssproxy process in
> > rpc_clnt_debugfs_register().  The code at this point is:
> > 
> >    0xffffffff8195cbdd <+450>:   mov    0x50(%rax),%rcx   <--- oopsing
> >    0xffffffff8195cbe1 <+454>:   mov    $0xffffffff821cc8ba,%rdx
> >    0xffffffff8195cbe8 <+461>:   mov    $0x18,%esi
> >    0xffffffff8195cbed <+466>:   lea    -0x30(%rbp),%rdi
> >    0xffffffff8195cbf1 <+470>:   callq  0xffffffff819db773 <snprintf>
> > 
> > RAX is -EACCES.
> > 
> > Looking in the source:
> > 
> > 	len = snprintf(name, sizeof(name), "../../rpc_xprt/%s",
> > 			xprt->debugfs->d_name.name);
> > 
> > I think xprt->debugfs is the value in RAX.
> > 
> > 	(gdb) p &((struct dentry *)0)->d_name.name
> > 	$5 = (const unsigned char **) 0x50 <irq_stack_union+80>
> > 
> > which matches the offset on the oopsing MOV instruction.
> > 
> > This is with linus/master (aa0c38cf39de73bf7360a3da8f1707601261e518).
> 
> Ugh, yeah, I see the problem, sorry about that.
> 
> I wonder why the debugfs call is always failing, that's not good...
> 
> let me dig and see if I already have a patch for this...

I have a much larger cleanup patch for this code, but this single-line
change should solve the issue for now.  Can you test it to verify?

thanks,

greg k-h

------------------

diff --git a/net/sunrpc/debugfs.c b/net/sunrpc/debugfs.c
index 45a033329cd4..19bb356230ed 100644
--- a/net/sunrpc/debugfs.c
+++ b/net/sunrpc/debugfs.c
@@ -146,7 +146,7 @@ rpc_clnt_debugfs_register(struct rpc_clnt *clnt)
 	rcu_read_lock();
 	xprt = rcu_dereference(clnt->cl_xprt);
 	/* no "debugfs" dentry? Don't bother with the symlink. */
-	if (!xprt->debugfs) {
+	if (IS_ERR_OR_NULL(xprt->debugfs)) {
 		rcu_read_unlock();
 		return;
 	}
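
For illustration, here is a rough userspace sketch of why the old
`!xprt->debugfs` test lets an error pointer such as -EACCES through while
IS_ERR_OR_NULL() does not.  The helpers err_ptr(), is_err() and
is_err_or_null() are stand-ins written for this example, modelled on the
kernel's ERR_PTR()/IS_ERR()/IS_ERR_OR_NULL(); old_check_skips() and
new_check_skips() are hypothetical names for the two versions of the
condition in the patch above, not functions from net/sunrpc.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the kernel convention that the top 4095 addresses
 * are reserved for encoded negative errno values. */
#define MAX_ERRNO 4095

/* Userspace stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Userspace stand-in for IS_ERR(): true if ptr lies in the errno range. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Userspace stand-in for IS_ERR_OR_NULL(). */
static inline bool is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}

/* Old condition: only a NULL dentry causes the symlink to be skipped. */
static bool old_check_skips(const void *debugfs)
{
	return !debugfs;
}

/* New condition: NULL and error pointers both skip the symlink. */
static bool new_check_skips(const void *debugfs)
{
	return is_err_or_null(debugfs);
}
```

With a dentry value of err_ptr(-EACCES), old_check_skips() returns false, so
the caller would go on to dereference the error pointer, which is the oops
David reported.  new_check_skips() returns true for that value and for NULL,
and false for an ordinary valid pointer.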


