Re: WTF is ceph_lookup_open() doing with nd->intent.open.file?

Sage Weil <sage@xxxxxxxxxxxx> · Sun, 26 Jun 2011 22:24:59 -0700 (PDT)

On Sun, 26 Jun 2011, Al Viro wrote:
> ceph_lookup_open() does the following:
> 
>         struct file *file = nd->intent.open.file;
>         struct inode *parent_inode = get_dentry_parent_inode(file->f_dentry);
> 
> Note that at this point nd->intent.open.file is going to have NULL ->f_dentry.
> What's more, we end up calling ceph_init_file() on that struct file.  If
> open(2) fails *after* the call of that sucker, we'll end up leaking
> from ceph_file_cachep, since ->release() will *not* be called - VFS will
> have no damn indication that it needs to.  Not that calling ->i_fop->open()
> on something without ->f_op (and ->f_dentry, and...) would be a good idea...
> 
> What is that code supposed to do, anyway?  Looks like a bastardized
> variant of the atomic open tricks NFS is pulling off, without the
> proper use of lookup_instantiate_filp()...  The thing is,
> lookup_instantiate_filp() takes care to set ->f_path.dentry, which is
> what distinguishes struct file that had been through ->open() from
> ones that had not.  So no ->release() for you...
> 
> Moreover, what would you expect to set ->f_dentry by the time you call
> ->lookup()?  Looks like you expect that parent_inode to be the directory
> you are doing lookup in, so why not use the dir argument of ceph_lookup_open()?
> While we are at it, what's "locked_dir" and what is it for?  AFAICS,
> nothing has ever looked at it - not since the mainline merge...
> 
> Either I'm seriously confused, or that code is...

Yeah, it's the code.  This is all pre- ceph merge, so my memory is hazy, 
but as I recall the open intents worked, but never quite as I expected.  
Since I'm pretty sure this is the first time I've looked at 
lookup_instantiate_filp(), that's not surprising.

The f_dentry bit looks like it got copy/pasted from ceph_open way back 
when.  In the end it means we just pass a NULL value to 
ceph_mdsc_do_request(), but that's actually fine (and correct in this 
case).  I'll clean it up in the next window.  

As for lookup_instantiate_filp(), I have a patch that fixes this up and 
seems to work but want to spend some more time playing with it to make 
sure it's really doing the right thing here.  Since the only problem with 
the current behavior is a leaked filp in the error case (right?) that can 
wait for the next window too.

Thanks for the heads-up!
sage
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html