On Apr 07, 2009  01:00 -0700, Mark Fasheh wrote:
> This very, very rough patch set adds three flags to fstatat - AT_NO_SIZE,
> AT_NO_TIMES, and AT_STRICT.

It seems you and Oleg have two patches crossing in the night...

> The first two flags (AT_NO_SIZE, AT_NO_TIMES) allow userspace to notify the
> file system layer that certain stat fields are not required to be accurate.

The problem with "AT_NO_*" is that old applications which couldn't possibly
know about or use a new stat field couldn't possibly know not to ask for it.
Instead, as was proposed in the last generation of this thread, there should
be AT_GET_{ATIME,MTIME,CTIME,BLOCKS,SIZE,NLINKS,...}, to return the flags
that the application actually wants. If none of them are specified, then
the current behaviour of "get all attributes" is kept.

> Some file systems want this information in order to optimize away expensive
> operations associated with stat. In particular, NFS can avoid some syncing
> to the server (if userspace doesn't want atime, ctime or mtime) and Lustre
> can avoid some expensive locking by avoiding an update of various size
> fields (st_size, st_blocks).

Actually, despite what was said today, Lustre doesn't revoke the writing
client's locks when getting the file size, unlike block-based filesystems.
That said, it is still relatively a lot of work to query the size for a
widely-striped file since there may be a hundred servers holding the file
data and any one of them might have the end-of-file.

> AT_STRICT allows userspace to indicate that it wants the most up to date
> version of a files status, regardless of performance impact. A distributed
> file system which has a non-coherent inode cache would know then to send a
> direct query to it's server.

The issue with AT_STRICT is that applications TODAY are expecting the proper
file status. I'd be more inclined to have an AT_LAZY flag for applications
that do NOT require the most up-to-date information.

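As a rough illustration of the AT_GET_* idea described above (this is only a
sketch: the flag values, struct attrs, and both helper names are made up for
the example and do not come from any posted patch), the "no bits set means
get everything" rule could look like this:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag names and values, for illustration only. */
#define AT_GET_ATIME   0x01u
#define AT_GET_MTIME   0x02u
#define AT_GET_CTIME   0x04u
#define AT_GET_BLOCKS  0x08u
#define AT_GET_SIZE    0x10u
#define AT_GET_NLINKS  0x20u
#define AT_GET_ALL     (AT_GET_ATIME | AT_GET_MTIME | AT_GET_CTIME | \
                        AT_GET_BLOCKS | AT_GET_SIZE | AT_GET_NLINKS)

/* Simplified stand-in for struct kstat (assumption, not the kernel type). */
struct attrs {
    long atime, mtime, ctime;
    long long blocks, size;
    unsigned nlink;
};

/*
 * No AT_GET_* bits set means "get all attributes", so callers that
 * predate the flags keep today's behaviour without any change.
 */
static unsigned at_get_effective(unsigned flags)
{
    unsigned want = flags & AT_GET_ALL;

    return want ? want : AT_GET_ALL;
}

/* Copy only the requested fields; unrequested fields are zeroed here. */
static void attrs_fill(struct attrs *out, const struct attrs *in,
                       unsigned flags)
{
    unsigned want = at_get_effective(flags);

    assert(out != NULL && in != NULL);
    memset(out, 0, sizeof(*out));
    if (want & AT_GET_ATIME)
        out->atime = in->atime;
    if (want & AT_GET_MTIME)
        out->mtime = in->mtime;
    if (want & AT_GET_CTIME)
        out->ctime = in->ctime;
    if (want & AT_GET_BLOCKS)
        out->blocks = in->blocks;
    if (want & AT_GET_SIZE)
        out->size = in->size;
    if (want & AT_GET_NLINKS)
        out->nlink = in->nlink;
}
```

With a positive mask like this, a stat field added later only costs anything
for callers that explicitly ask for it, which is the point of preferring
AT_GET_* over AT_NO_*.
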
> As noted previously, these patches are really rough. Mostly I'd like to get
> some feedback on the interface and general direction of implementation. Some
> glaring issues which we want to resolve:
>
> - This patch set doesn't actually wire up any client file systems yet :)
>
> - There's the question of whether we wire up [fl]stat(2) variants instead of
>   using fstatat. I went with the former as the implementation was more
>   straight forward.

It appears that fstatat(2) fits the need for "statlite" cleanly. I'd thought
today that this would require opening each file, but it is only necessary to
open the parent directory and call fstatat() for each filename in the
directory.

> - I'm not sure whether we want to force zeroing of the optional fields, or
>   return whatever's in the inode (which may be stale, or just junk).

I think the fields should be zeroed, since there have been security bugs
filed in rare cases when other bits of kernel data are exposed to user
space, even if just a byte or two.

> -int vfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
> +int vfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat,
> +		int flags)
> {
> 	if (inode->i_op->getattr)
> -		return inode->i_op->getattr(mnt, dentry, stat);
> +		return inode->i_op->getattr(mnt, dentry, stat, attr_flags);

Oleg's version added a second ->getattr_lite() call that passed the flags,
which avoids changing all of the filesystems, though I guess it isn't a huge
deal either way.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html