Re: why is i_ino unsigned long, anyway?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2 Oct 2013, J. Bruce Fields wrote:
> On Wed, Oct 02, 2013 at 11:47:22AM -0700, Sage Weil wrote:
> > On Wed, 2 Oct 2013, J. Bruce Fields wrote:
> > > On Wed, Oct 02, 2013 at 09:05:27AM -0700, Christoph Hellwig wrote:
> > > > On Wed, Oct 02, 2013 at 10:25:27AM -0400, J. Bruce Fields wrote:
> > > > > If so then it's no huge code duplication to it by hand:
> > > > > 
> > > > > 	if (inode->i_op->getattr)
> > > > > 		inode->i_op->getattr(path->mnt, path->dentry, &stat);
> > > > > 	else
> > > > > 		generic_fillattr(inode, &stat);
> > > > 
> > > > Maybe make that a vfs_getattr_nosec and let vfs_getattr call it?
> > > > 
> > > > Including a proper kerneldoc comment explaining when to use it, please.
> > > 
> > > Something like this?
> > 
> > I'm late to this thread, but: getattr() is a comparatively expensive 
> > operation for just getting an (immutable) ino value on a distributed or 
> > clustered fs.  It would be nice if there were a separate call that didn't 
> > try to populate all the other kstat fields with valid data.  Something 
> > like this was part of the xstat series from forever ago (a bit mask passed 
> > to getattr indicating which fields were needed), so the approach below 
> > might be okay if we think we'll get there sometime soon, but my preference 
> > would be for another fix...
> 
> Understood, and perhaps this code should eventually take advantage of
> xstat, but:
> 
> 	- this code isn't handling the common case, so we're not too
> 	  worried about the performance, and
> 	- you also have the option of defining your own
> 	  export_operations->get_name.  Especially consider that if you
> 	  have some better way to answer the question "what is the name
> 	  of inode X in directory Y" that's better than reading Y
> 	  looking for a matching inode number.

Ah, I see.  Sounds good then!

Thanks-
sage

> 
> --b.
> 
> > (Ceph also uses 64-bit inos.)
> > 
> > Thanks!
> > sage
> > 
> > 
> > > 
> > > --b.
> > > 
> > > commit 8418a41b7192cf2f372ae091207adb29a088f9a0
> > > Author: J. Bruce Fields <bfields@xxxxxxxxxx>
> > > Date:   Tue Sep 10 11:41:12 2013 -0400
> > > 
> > >     exportfs: fix 32-bit nfsd handling of 64-bit inode numbers
> > >     
> > >     Symptoms were spurious -ENOENTs on stat of an NFS filesystem from a
> > >     32-bit NFS server exporting a very large XFS filesystem, when the
> > >     server's cache is cold (so the inodes in question are not in cache).
> > >     
> > >     Reported-by: Trevor Cordes <trevor@xxxxxxxxxxxxx>
> > >     Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>
> > > 
> > > diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> > > index 293bc2e..811831a 100644
> > > --- a/fs/exportfs/expfs.c
> > > +++ b/fs/exportfs/expfs.c
> > > @@ -215,7 +215,7 @@ struct getdents_callback {
> > >  	struct dir_context ctx;
> > >  	char *name;		/* name that was found. It already points to a
> > >  				   buffer NAME_MAX+1 is size */
> > > -	unsigned long ino;	/* the inum we are looking for */
> > > +	u64 ino;		/* the inum we are looking for */
> > >  	int found;		/* inode matched? */
> > >  	int sequence;		/* sequence counter */
> > >  };
> > > @@ -255,10 +255,10 @@ static int get_name(const struct path *path, char *name, struct dentry *child)
> > >  	struct inode *dir = path->dentry->d_inode;
> > >  	int error;
> > >  	struct file *file;
> > > +	struct kstat stat;
> > >  	struct getdents_callback buffer = {
> > >  		.ctx.actor = filldir_one,
> > >  		.name = name,
> > > -		.ino = child->d_inode->i_ino
> > >  	};
> > >  
> > >  	error = -ENOTDIR;
> > > @@ -268,6 +268,16 @@ static int get_name(const struct path *path, char *name, struct dentry *child)
> > >  	if (!dir->i_fop)
> > >  		goto out;
> > >  	/*
> > > +	 * inode->i_ino is unsigned long, kstat->ino is u64, so the
> > > +	 * former would be insufficient on 32-bit hosts when the
> > > +	 * filesystem supports 64-bit inode numbers.  So we need to
> > > +	 * actually call ->getattr, not just read i_ino:
> > > +	 */
> > > +	error = vfs_getattr_nosec(path, &stat);
> > > +	if (error)
> > > +		return error;
> > > +	buffer.ino = stat.ino;
> > > +	/*
> > >  	 * Open the directory ...
> > >  	 */
> > >  	file = dentry_open(path, O_RDONLY, cred);
> > > diff --git a/fs/stat.c b/fs/stat.c
> > > index 04ce1ac..71a39e8 100644
> > > --- a/fs/stat.c
> > > +++ b/fs/stat.c
> > > @@ -37,14 +37,21 @@ void generic_fillattr(struct inode *inode, struct kstat *stat)
> > >  
> > >  EXPORT_SYMBOL(generic_fillattr);
> > >  
> > > -int vfs_getattr(struct path *path, struct kstat *stat)
> > > +/**
> > > + * vfs_getattr_nosec - getattr without security checks
> > > + * @path: file to get attributes from
> > > + * @stat: structure to return attributes in
> > > + *
> > > + * Get attributes without calling security_inode_getattr.
> > > + *
> > > + * Currently the only caller other than vfs_getattr is internal to the
> > > + * filehandle lookup code, which uses only the inode number and returns
> > > + * no attributes to any user.  Any other code probably wants
> > > + * vfs_getattr.
> > > + */
> > > +int vfs_getattr_nosec(struct path *path, struct kstat *stat)
> > >  {
> > >  	struct inode *inode = path->dentry->d_inode;
> > > -	int retval;
> > > -
> > > -	retval = security_inode_getattr(path->mnt, path->dentry);
> > > -	if (retval)
> > > -		return retval;
> > >  
> > >  	if (inode->i_op->getattr)
> > >  		return inode->i_op->getattr(path->mnt, path->dentry, stat);
> > > @@ -53,6 +60,18 @@ int vfs_getattr(struct path *path, struct kstat *stat)
> > >  	return 0;
> > >  }
> > >  
> > > +EXPORT_SYMBOL_GPL(vfs_getattr_nosec);
> > > +
> > > +int vfs_getattr(struct path *path, struct kstat *stat)
> > > +{
> > > +	int retval;
> > > +
> > > +	retval = security_inode_getattr(path->mnt, path->dentry);
> > > +	if (retval)
> > > +		return retval;
> > > +	return vfs_getattr_nosec(path, stat);
> > > +}
> > > +
> > >  EXPORT_SYMBOL(vfs_getattr);
> > >  
> > >  int vfs_fstat(unsigned int fd, struct kstat *stat)
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index 9818747..5a51faa 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -2500,6 +2500,7 @@ extern int page_symlink(struct inode *inode, const char *symname, int len);
> > >  extern const struct inode_operations page_symlink_inode_operations;
> > >  extern int generic_readlink(struct dentry *, char __user *, int);
> > >  extern void generic_fillattr(struct inode *, struct kstat *);
> > > +int vfs_getattr_nosec(struct path *path, struct kstat *stat);
> > >  extern int vfs_getattr(struct path *, struct kstat *);
> > >  void __inode_add_bytes(struct inode *inode, loff_t bytes);
> > >  void inode_add_bytes(struct inode *inode, loff_t bytes);
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > 
> > > 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux