On 2010-06-29, at 19:16, David Howells wrote: > Implement a pair of new system calls to provide extended and further extensible stat functions. > > The third of the associated patches provides these new system calls: > > struct xstat_dev { > unsigned int major; > unsigned int minor; > }; Doesn't glibc use two 64-bit values for devices? > struct xstat { > unsigned int struct_version; > #define XSTAT_STRUCT_VERSION 0 I dislike sequential "version" fields (which are "all or nothing"), and prefer the ext2/3/4-like "feature flags" that allow the caller to state what features and fields it expects and/or understands. This allows extensibility without unduly breaking compatibility. > unsigned int st_mode; Having a separate MODE flag would be great for "ls --color", since that is basically the only information that it needs that isn't already available in the readdir() output. > unsigned int st_nlink; > unsigned int st_uid; > unsigned int st_gid; In struct stat64 it uses "unsigned long" for both st_uid and st_gid. Having a 64-bit value here is useful for CIFS servers to be able to remap different UID domains into a 32-bit domain and a 32-bit UID. If you change this, please remember to reorder the fields for proper 64-bit alignment. > unsigned int st_blksize; > Does st_blksize really need to be 64 bits on a 64-bit system? I don't think so, but adding a 32-bit padding couldn't hurt. > unsigned long long st_ino; > unsigned long long st_size; > Should the inode number and data version number fields be 128-bit? I wouldn't object to having a 128-bit st_ino field, since this is what Lustre will be using internally in the next release. Similarly, _filesystems_ are not SO far from hitting the 64-bit size limit (a Lustre filesystem will likely hit 100PB ~= 2^57 bytes in the next year), so having a 128-bit st_size wouldn't be unreasonable, because... What is also very convenient that I learned Solaris stat() does is it returns the device size in st_size for a block device file. This is very convenient, and avoids the morass of ioctls and "binary llseek guessing" used by libext2fs and libblkid to determine the size of a block device. Any reason not to add this into this new syscall? > unsigned long long st_blocks; If st_size is 128-bit (or has padding) then st_blocks should have the same. > unsigned long long query_flags; It is inconsistent to have all the other fields use the "st_" prefix, but "query_flags" and "struct_version" do not have this prefix. > #define XSTAT_QUERY_AMC_TIMES 0x00000004ULL Can these be split into separate ATIME, MTIME, CTIME flags? > #define XSTAT_QUERY_CREATION_TIME 0x00000008ULL It seems a bit inconsistent to call the field "st_btime" and the mask "CREATION_TIME". It would be more consistent (if somewhat less clear) to call the mask "BTIME". The struct definition should get short comments for each field to explain their meaning anyway, so "st_btime" can have "/* birth/creation time */". > #define XSTAT_QUERY_INODE_GENERATION 0x00000020ULL This is also a bit inconsistent with the "st_gen" field name. > #define XSTAT_QUERY_DATA_VERSION 0x00000040ULL It wouldn't be a bad idea to interleave these flags with each of the fields that they represent, to make it more clear what is included in each. > #define XSTAT_QUERY__ORDINARY_SET 0x00000017ULL > #define XSTAT_QUERY__GET_ANYWAY 0x0000007fULL Could you provide some information what the semantic distinction between these is? It might be useful to have an "XSTAT_QUERY_LEGACY_STAT" mask that returns only the fields that are in the previous struct stat, unless that is what "ORDINARY_SET" means, in which case it should be renamed I think. > #define XSTAT_QUERY__DEFINED_SET 0x0000007fULL It is smart to have a "DEFINED_SET" mask that maps to the currently-understood fields. This ensures that applications compiled against a specific set of headers/struct will not request fields which they don't understand. It might be better to call this "XSTAT_QUERY_ALL" so that it is more easily understood and used by callers, instead of the incorrect "-1" or "~0" that some may be tempted to use if they don't understand what "__DEFINED_SET" means. Cheers, Andreas -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html