On Mon, Jun 4, 2018 at 5:01 PM, David Howells <dhowells@xxxxxxxxxx> wrote: > Arnd Bergmann <arnd@xxxxxxxx> wrote: > >> ntfs has sb->s_time_gran=100, and vfat should really have >> sb->s_time_gran=2000000000 but that doesn't seem to be set right >> at the moment. > > (V)FAT actually has a different granularity on each timestamp. Ah, right. I guess I missed the fact that the file system can override it, so FAT doesn't actually have to do it this way. >> > +struct fsinfo_capabilities { >> > + __u64 supported_stx_attributes; /* What statx::stx_attributes are supported */ >> > + __u32 supported_stx_mask; /* What statx::stx_mask bits are supported */ >> > + __u32 supported_ioc_flags; /* What FS_IOC_* flags are supported */ >> > + __u8 capabilities[(fsinfo_cap__nr + 7) / 8]; >> > +}; >> >> This looks a bit odd: with the 44 capabilities, you end up having a six-byte >> array followed by two bytes of implicit padding. If the number of >> capabilities grows beyond 64, you have a nine byte array with more padding >> to the next alignof(__u64). Is that intentional? > > I've split the capabilities out into their own thing. I've attached the > revised patch below. I'm still not completely clear on how variable-length structures are supposed to be handled by the fsinfo syscall. It seems like a possible source of bugs to return a structure from the kernel that has a different size in kernel and user space depending on the fsinfo_cap__nr value at compile time. How does one e.g. guarantee there is no out of bounds access when you run new user space on an older kernel that has a smaller structure? >> > +struct fsinfo_timestamp_info { >> > + __s64 minimum_timestamp; /* Minimum timestamp value in seconds */ >> > + __s64 maximum_timestamp; /* Maximum timestamp value in seconds */ >> > + __u16 atime_gran_mantissa; /* Granularity(secs) = mant * 10^exp */ >> > + __u16 btime_gran_mantissa; >> > + __u16 ctime_gran_mantissa; >> > + __u16 mtime_gran_mantissa; >> > + __s8 atime_gran_exponent; >> > + __s8 btime_gran_exponent; >> > + __s8 ctime_gran_exponent; >> > + __s8 mtime_gran_exponent; >> > +}; >> >> This structure has a slightly inconsistent amount of padding at the end: >> on x86-32 it has no padding, everywhere else it has 32 bits of padding >> to make it 64-bit aligned. Maybe add a __u32 reserved field? > > It occurs to me that the min and max may be different for each timestamp. > Maybe I should have: > > struct fsinfo_timestamp_info { > char name[7 + 1]; > __s64 minimum_timestamp; > __s64 maximum_timestamp; > __u16 granularity_mantissa; > __s8 granularity_exponent; > __u8 __reserved[5]; > }; > I don't particularly like having a string in there, that seems to add unnecessary complexity compared to using an integer. Having four min/max values would make it more generic but I don't think we have a need for that at the moment with any of the file systems we support. In any case, it would be nice to have a trivial way to query which of the four timestamp types are supported at all, and returning them separately would be one way of doing that. > and then you iterate through them by setting Nth. I could just put a: > > __u64 granularity; > > field, expressed in nS, rather than mantissa and exponent, but doing it this > way allows me to express granularities less that 1nS very simply (something > Dave Chinner was talking about). > > It might also be worth putting minimum_timestamp and maximum_timestamp in > terms of granularity rather than nS. Expressing the granularity in nanoseconds (or something smaller) would seem more natural to me, but I don't really care much either way. Arnd