Hi David, A couple of comments below. On Thu, Jul 1, 2010 at 1:36 AM, David Howells <dhowells@xxxxxxxxxx> wrote: > Add a pair of system calls to make extended file stats available, including > file creation time, inode version and data version where available through the > underlying filesystem. > > [This depends on the previously posted pair of patches to (a) constify a number > of syscall string and buffer arguments and (b) rearrange AFS's use of > i_version and i_generation]. > > The following structures are defined for their use: > > struct xstat_parameters { > unsigned long long request_mask; Poor name, since it's a value-result arg? Better maybe something like "field_mask"? > }; > > struct xstat_dev { > unsigned int major, minor; > }; > > struct xstat_time { > unsigned long long tv_sec, tv_nsec; > }; > > struct xstat { > unsigned int st_mode; > unsigned int st_nlink; > unsigned int st_uid; > unsigned int st_gid; > struct xstat_dev st_rdev; > struct xstat_dev st_dev; > struct xstat_time st_atime; > struct xstat_time st_mtime; > struct xstat_time st_ctime; > struct xstat_time st_btime; > unsigned long long st_ino; > unsigned long long st_size; > unsigned long long st_blksize; > unsigned long long st_blocks; > unsigned long long st_gen; > unsigned long long st_data_version; > unsigned long long st_result_mask; > unsigned long long st_extra_results[0]; > }; > > where st_btime is the file creation time, st_gen is the inode generation > (i_generation), st_data_version is the data version number (i_version), > request_mask and st_result_mask are bitmasks of data desired/provided and > st_extra_results[] is where as-yet undefined fields are appended. > > The defined bits in request_mask and st_result_mask are: > > XSTAT_REQUEST_MODE Want/got st_mode > XSTAT_REQUEST_NLINK Want/got st_nlink > XSTAT_REQUEST_UID Want/got st_uid > XSTAT_REQUEST_GID Want/got st_gid > XSTAT_REQUEST_RDEV Want/got st_rdev > XSTAT_REQUEST_ATIME Want/got st_atime > XSTAT_REQUEST_MTIME Want/got st_mtime > XSTAT_REQUEST_CTIME Want/got st_ctime > XSTAT_REQUEST_INO Want/got st_ino > XSTAT_REQUEST_SIZE Want/got st_size > XSTAT_REQUEST_BLOCKS Want/got st_blocks > XSTAT_REQUEST__BASIC_STATS The stuff in the normal stat struct > XSTAT_REQUEST_BTIME Want/got st_btime > XSTAT_REQUEST_GEN Want/got st_gen > XSTAT_REQUEST_DATA_VERSION Want/got st_data_version > XSTAT_REQUEST__EXTENDED_STATS The stuff in the xstat struct > XSTAT_REQUEST__ALL_STATS The defined set of requestables > > The system calls are: > > ssize_t ret = xstat(int dfd, > const char *filename, > unsigned flags, > const struct xstat_parameters *params, > struct xstat *buffer, > size_t buflen); > > ssize_t ret = fxstat(unsigned fd, > unsigned flags, > const struct xstat_parameters *params, > struct xstat *buffer, > size_t buflen); > > > The dfd, filename, flags and fd parameters indicate the file to query. There > is no equivalent of lstat() as that can be emulated with xstat() by passing > AT_SYMLINK_NOFOLLOW in flags. > > AT_FORCE_ATTR_SYNC can also be set in flags. This will require a network > filesystem to synchronise its attributes with the server. > > When the system call is executed, the request_mask bitmask is read from the > parameter block to work out what the user is requesting. If params is NULL, > then request_mask will be assumed to be XSTAT_REQUEST__GET_ANYWAY. There is no XSTAT_REQUEST__GET_ANYWAY, AFAICS. I guess here you meant XSTAT_REQUEST__EXTENDED_STATS? Or? > The request_mask should be set by the caller to specify extra results that the > caller may desire. These come in a number of classes: > > (0) dev, blksize. > > These are local data and are always available. > > (1) mode, nlinks, uid, gid, [amc]time, ino, size, blocks. > > These will be returned whether the caller asks for them or not. The > corresponding bits in result_mask will be set to indicate their presence. > > If the caller didn't ask for them, then they may be approximated. For > example, NFS won't waste any time updating them from the server, unless as > a byproduct of updating something requested. > > (2) rdev. > > As for class (1), but this won't be returned if the file is not a blockdev > or chardev. The bit will be cleared if the value is not returned. > > (3) File creation time, inode generation and data version. > > These will be returned if available whether the caller asked for them or > not. The corresponding bits in result_mask will be set or cleared as > appropriate to indicate their presence. > > If the caller didn't ask for them, then they may be approximated. For > example, NFS won't waste any time updating them from the server, unless > as a byproduct of updating something requested. > > (4) Extra results. > > These will only be returned if the caller asked for them by setting their > bits in request_mask. They will be placed in the buffer after the xstat > struct in ascending result_mask bit order. Any bit set in request_mask > mask will be left set in result_mask if the result is available and > cleared otherwise. > > The pointer into the results list will be rounded up to the nearest 8-byte > boundary after each result is written in. The size of each extra result > is specific to the definition for that result. > > No extra results are currently defined. > > If the buffer is insufficiently big, the syscall returns the amount of space it > will need to write the complete result set and returns a partial result in the > buffer. This case is almost certainly a user error, so why not simply return an error (-1 and ERANGE or E2BIG)? The above approach invites userspace errors of the form: if (xtat(...) < 0) { /* How users often check for error */ /* I'll handle the error */ } else { /* The call succeeded; I'm fine */ } Instead, more complex error-handling is required for *every* call: ret = xstat(..., buflen); if (ret < 0 || ret > buflen) /* I'll handle the error */ } else { /* The call succeeded; I'm fine */ } If you are looking for a way to inform the user about the required buffer size, I think it would be better to take a leaf from the getxattr(2) book: if 'buflen' is zero, then do nothing with the output arg, but return the size that would be required. Cheers, Michael -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html