Re: why is i_ino unsigned long, anyway?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 02, 2013 at 11:47:22AM -0700, Sage Weil wrote:
> On Wed, 2 Oct 2013, J. Bruce Fields wrote:
> > On Wed, Oct 02, 2013 at 09:05:27AM -0700, Christoph Hellwig wrote:
> > > On Wed, Oct 02, 2013 at 10:25:27AM -0400, J. Bruce Fields wrote:
> > > > If so then it's no huge code duplication to it by hand:
> > > > 
> > > > 	if (inode->i_op->getattr)
> > > > 		inode->i_op->getattr(path->mnt, path->dentry, &stat);
> > > > 	else
> > > > 		generic_fillattr(inode, &stat);
> > > 
> > > Maybe make that a vfs_getattr_nosec and let vfs_getattr call it?
> > > 
> > > Including a proper kerneldoc comment explaining when to use it, please.
> > 
> > Something like this?
> 
> I'm late to this thread, but: getattr() is a comparatively expensive 
> operation for just getting an (immutable) ino value on a distributed or 
> clustered fs.  It would be nice if there were a separate call that didn't 
> try to populate all the other kstat fields with valid data.  Something 
> like this was part of the xstat series from forever ago (a bit mask passed 
> to getattr indicating which fields were needed), so the approach below 
> might be okay if we think we'll get there sometime soon, but my preference 
> would be for another fix...

Understood, and perhaps this code should eventually take advantage of
xstat, but:

	- this code isn't handling the common case, so we're not too
	  worried about the performance, and
	- you also have the option of defining your own
	  export_operations->get_name.  Especially consider that if you
	  have some better way to answer the question "what is the name
	  of inode X in directory Y" that's better than reading Y
	  looking for a matching inode number.

--b.

> (Ceph also uses 64-bit inos.)
> 
> Thanks!
> sage
> 
> 
> > 
> > --b.
> > 
> > commit 8418a41b7192cf2f372ae091207adb29a088f9a0
> > Author: J. Bruce Fields <bfields@xxxxxxxxxx>
> > Date:   Tue Sep 10 11:41:12 2013 -0400
> > 
> >     exportfs: fix 32-bit nfsd handling of 64-bit inode numbers
> >     
> >     Symptoms were spurious -ENOENTs on stat of an NFS filesystem from a
> >     32-bit NFS server exporting a very large XFS filesystem, when the
> >     server's cache is cold (so the inodes in question are not in cache).
> >     
> >     Reported-by: Trevor Cordes <trevor@xxxxxxxxxxxxx>
> >     Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>
> > 
> > diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> > index 293bc2e..811831a 100644
> > --- a/fs/exportfs/expfs.c
> > +++ b/fs/exportfs/expfs.c
> > @@ -215,7 +215,7 @@ struct getdents_callback {
> >  	struct dir_context ctx;
> >  	char *name;		/* name that was found. It already points to a
> >  				   buffer NAME_MAX+1 is size */
> > -	unsigned long ino;	/* the inum we are looking for */
> > +	u64 ino;		/* the inum we are looking for */
> >  	int found;		/* inode matched? */
> >  	int sequence;		/* sequence counter */
> >  };
> > @@ -255,10 +255,10 @@ static int get_name(const struct path *path, char *name, struct dentry *child)
> >  	struct inode *dir = path->dentry->d_inode;
> >  	int error;
> >  	struct file *file;
> > +	struct kstat stat;
> >  	struct getdents_callback buffer = {
> >  		.ctx.actor = filldir_one,
> >  		.name = name,
> > -		.ino = child->d_inode->i_ino
> >  	};
> >  
> >  	error = -ENOTDIR;
> > @@ -268,6 +268,16 @@ static int get_name(const struct path *path, char *name, struct dentry *child)
> >  	if (!dir->i_fop)
> >  		goto out;
> >  	/*
> > +	 * inode->i_ino is unsigned long, kstat->ino is u64, so the
> > +	 * former would be insufficient on 32-bit hosts when the
> > +	 * filesystem supports 64-bit inode numbers.  So we need to
> > +	 * actually call ->getattr, not just read i_ino:
> > +	 */
> > +	error = vfs_getattr_nosec(path, &stat);
> > +	if (error)
> > +		return error;
> > +	buffer.ino = stat.ino;
> > +	/*
> >  	 * Open the directory ...
> >  	 */
> >  	file = dentry_open(path, O_RDONLY, cred);
> > diff --git a/fs/stat.c b/fs/stat.c
> > index 04ce1ac..71a39e8 100644
> > --- a/fs/stat.c
> > +++ b/fs/stat.c
> > @@ -37,14 +37,21 @@ void generic_fillattr(struct inode *inode, struct kstat *stat)
> >  
> >  EXPORT_SYMBOL(generic_fillattr);
> >  
> > -int vfs_getattr(struct path *path, struct kstat *stat)
> > +/**
> > + * vfs_getattr_nosec - getattr without security checks
> > + * @path: file to get attributes from
> > + * @stat: structure to return attributes in
> > + *
> > + * Get attributes without calling security_inode_getattr.
> > + *
> > + * Currently the only caller other than vfs_getattr is internal to the
> > + * filehandle lookup code, which uses only the inode number and returns
> > + * no attributes to any user.  Any other code probably wants
> > + * vfs_getattr.
> > + */
> > +int vfs_getattr_nosec(struct path *path, struct kstat *stat)
> >  {
> >  	struct inode *inode = path->dentry->d_inode;
> > -	int retval;
> > -
> > -	retval = security_inode_getattr(path->mnt, path->dentry);
> > -	if (retval)
> > -		return retval;
> >  
> >  	if (inode->i_op->getattr)
> >  		return inode->i_op->getattr(path->mnt, path->dentry, stat);
> > @@ -53,6 +60,18 @@ int vfs_getattr(struct path *path, struct kstat *stat)
> >  	return 0;
> >  }
> >  
> > +EXPORT_SYMBOL_GPL(vfs_getattr_nosec);
> > +
> > +int vfs_getattr(struct path *path, struct kstat *stat)
> > +{
> > +	int retval;
> > +
> > +	retval = security_inode_getattr(path->mnt, path->dentry);
> > +	if (retval)
> > +		return retval;
> > +	return vfs_getattr_nosec(path, stat);
> > +}
> > +
> >  EXPORT_SYMBOL(vfs_getattr);
> >  
> >  int vfs_fstat(unsigned int fd, struct kstat *stat)
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 9818747..5a51faa 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2500,6 +2500,7 @@ extern int page_symlink(struct inode *inode, const char *symname, int len);
> >  extern const struct inode_operations page_symlink_inode_operations;
> >  extern int generic_readlink(struct dentry *, char __user *, int);
> >  extern void generic_fillattr(struct inode *, struct kstat *);
> > +int vfs_getattr_nosec(struct path *path, struct kstat *stat);
> >  extern int vfs_getattr(struct path *, struct kstat *);
> >  void __inode_add_bytes(struct inode *inode, loff_t bytes);
> >  void inode_add_bytes(struct inode *inode, loff_t bytes);
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux