On Sun, 2024-05-26 at 11:31 -0400, Theodore Ts'o wrote:
> On Sun, May 26, 2024 at 08:20:16AM -0400, Jeff Layton wrote:
> > 
> > Switch the __i_ctime fields to a single ktime_t. Move the
> > i_generation down above i_fsnotify_mask and then move the i_version
> > into the resulting 8 byte hole. This shrinks struct inode by 8 bytes
> > total, and should improve the cache footprint as the i_version and
> > __i_ctime are usually updated together.
> 
> So first of all, this patch is a bit confusing because the patch
> doesn't change __i_ctime, but rather i_ctime_sec and i_ctime_nsec, and
> Linus's tree doesn't have those fields. That's because the base
> commit in the patch, a6f48ee9b741, isn't in Linus's tree, and
> apparently this patch is dependent on "fs: switch timespec64 fields in
> inode to discrete integers"[1].
> 
> [1] https://lore.kernel.org/all/20240517-amtime-v1-1-7b804ca4be8f@xxxxxxxxxx/
> 

My bad. I should have mentioned that in the changelog, and I'll clean
up the title.

> > The one downside I can see to switching to a ktime_t is that if
> > someone has a filesystem with files on it that have ctimes outside
> > the ktime_t range (before ~1678 AD or after ~2262 AD), we won't be
> > able to display them properly in stat() without some special
> > treatment. I'm operating under the assumption that this is not a
> > practical problem.
> 
> There are two additional possible problems with this. The first is
> that if there is a maliciously fuzzed file system with ctimes outside
> of the ktime_t range, this will almost certainly trigger UBSAN
> warnings, which will result in syzkaller security bugs being reported
> to file system developers. This can be fixed by explicitly clamping
> ranges whenever converting to ktime_t in include/linux/fs.h, but that
> leads to another problem.
> 

There should be no undefined behavior here, since this is using
ktime_set() (see the standalone sketch below).

> The second problem is if the file system converts its on-disk inode
> to the in-memory struct inode, and then converts it from the
> in-memory format back to the on-disk inode format, and the timestamp
> is outside of the ktime_t range, this could result in the on-disk
> inode having its ctime field corrupted. Now, *most* of the time,
> whenever the inode needs to be written back to disk, the ctime field
> will be changed anyway. However, there are changes that don't result
> in userspace-visible changes but do involve internal file system
> changes (for example, an online repair or defrag, or a COW snap),
> where the file system doesn't set ctime --- and it's possible that
> this might leave the ctime field messed up.
> 
> We could argue that ctime fields outside of the ktime_t range should
> never, ever happen (except for malicious fuzz testing by systems like
> syzkaller), and so it could be argued that we don't care(tm), but it's
> worth at least a mention in the comments and commit description. Of
> course, a file system which *did* care could work around the problem
> by keeping its own copy of ctime in the file system-specific inode,
> but that would come at the cost of the space-saving benefits of this
> commit.
> 

My assumption was that we'd not overwrite an on-disk inode unless we
were going to stamp the ctime as well. That's not 100% true (as you
point out). I guess we have to decide whether we care about preserving
the results with this sort of intentional fuzz testing.
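
To make the clamping concrete, here's a quick userspace sketch that
mimics ktime_set()'s check (the constants are copied from the kernel
headers, but this is just an illustration, not the kernel code itself).
A ctime past ~2262 AD simply saturates to KTIME_MAX instead of
overflowing:

#include <stdint.h>
#include <stdio.h>

/* Mirrors NSEC_PER_SEC, KTIME_MAX and KTIME_SEC_MAX from the kernel */
#define NSEC_PER_SEC	1000000000LL
#define KTIME_MAX	INT64_MAX
#define KTIME_SEC_MAX	(KTIME_MAX / NSEC_PER_SEC)

/* Same check that the kernel's ktime_set() does */
static int64_t ktime_set_sketch(int64_t secs, unsigned long nsecs)
{
	if (secs >= KTIME_SEC_MAX)
		return KTIME_MAX;
	return secs * NSEC_PER_SEC + (int64_t)nsecs;
}

int main(void)
{
	/* A ctime around the year 2500 is well past the ~2262 AD limit */
	int64_t secs_2500 = 16725225600LL;

	printf("converted: %lld\n", (long long)ktime_set_sketch(secs_2500, 0));
	printf("KTIME_MAX: %lld\n", (long long)KTIME_MAX);
	return 0;
}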
> 
> I suspect what I'd do is add a manual check for an out-of-range ctime
> on disk, log a warning, and then clamp the ctime to the maximum
> ktime_t value, which is what would be returned by stat(2), and then
> write that clamped value back to disk if the ctime field doesn't get
> set to the current time before it needs to be written back to disk.
> 

ktime_set() does this:

	if (unlikely(secs >= KTIME_SEC_MAX))
		return KTIME_MAX;

So this should already clamp the value to KTIME_MAX, at least as long
as the seconds value is normalized when it's called. We certainly
could have this do a pr_warn or pr_notice too if the value comes back
as KTIME_MAX (rough, untested sketch of that below my sig).

Thanks for taking a look, Ted!
-- 
Jeff Layton <jlayton@xxxxxxxxxx>
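
For illustration only, here's the kind of hypothetical conversion
helper I have in mind; the helper name and warning text are made up
and not part of the patch. It just leans on ktime_set() saturating to
KTIME_MAX:

static inline ktime_t ctime_from_ts64(struct inode *inode,
				      struct timespec64 ts)
{
	ktime_t kt = ktime_set(ts.tv_sec, ts.tv_nsec);

	/* ktime_set() returns KTIME_MAX when tv_sec is out of range */
	if (unlikely(kt == KTIME_MAX))
		pr_warn_once("inode %lu: ctime outside ktime_t range, clamping\n",
			     inode->i_ino);
	return kt;
}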