On Mon, Nov 6, 2017 at 10:10 AM, David Howells <dhowells@xxxxxxxxxx> wrote:
> Paul Eggert <eggert@xxxxxxxxxxx> wrote:
>
>> Florian Weimer suggested I write you about a bug in 'stat' and related system
>> calls, a bug that will become more important as time goes on. The bug is that
>> when running in a 32-bit application, 'stat' reports incorrect time stamps for
>> files whose timestamps are so positive (or so negative) that they do not fit
>> into 32-bit signed integers. POSIX says that such 'stat' calls should fail
>> with errno == EOVERFLOW, and this is common practice elsewhere. However, the
>> Linux kernel causes 'stat' to succeed in this situation, discarding the
>> high-order bits of the time stamp.
>>
>> This bug is causing gzip 'make test' to fail when built in 32-bit mode and
>> running atop a 64-bit Linux kernel, as part of gzip's test for out-of-range
>> file timestamps that are encoded in compressed gzip streams; see:
>>
>> https://debbugs.gnu.org/bug=25636
>>
>> I filed a bug report about 'stat' for Fedora here:
>>
>> https://bugzilla.redhat.com/1419736
>>
>> and Florian suggested that I raise the issue with you.
>>
>> Is the intent that all 32-bit Linux apps will migrate to x32 or some other
>> model with 64-bit time_t,

32-bit applications will not migrate to x32, but all 32-bit architectures
will change to using 64-bit time_t at some point.

>> and that this will happen well before the year 2038
>> so the bug is unimportant?

Time stamps past 2038 and before 1970 are already important, because
they can be set using 'utimes/futimes/utimensat', and applications do
run into problems. E.g. copying a file from a file system with no
timestamps (everything is 0, a.k.a. Jan 1, 1970) to another file system
with incorrect time zones can give you timestamps on Dec 31, 1969.
Copying such a file to another file system that doesn't support
pre-1970 timestamps (e.g.
AFS) can give you timestamps in 2106, which in turn can get represented
in random other ways on file systems with different sets of limitations.
I think we want to keep the behavior as sane as possible in all those
cases.

>> Or is the intent that the 32-bit Linux clock will
>> wrap around after 2038 and go negative? Or is the intent something else? I'd
>> like to know so that I can fix gzip and other applications and their test
>> cases to match the intent.

Deepa has proposed a patch series for this in the past, limiting the
range of the timestamps to whatever is supported to prevent an overflow.
It's probably way down in her backlog behind other patch series that
need to be done first, but it's certainly still on the radar.

> The kernel now offers the statx() system call:
>
> http://man7.org/linux/man-pages/man2/statx.2.html
>
> But it's not yet supported by glibc.

statx doesn't address all of the issues here, since we have cases that
are easy to confuse:

- setting an out-of-range timestamp with utimensat() on a file system
  that is more limited than the software, e.g. setting a stamp after
  2038 on ext3fs or a stamp before 1970 on AFS. These should be
  truncated to the minimum/maximum supported time.

- reading a file system timestamp that is out of range for user space,
  e.g. a time stamp past 2038 on an ext4 file system with long inodes
  while using 32-bit user space and the stat() syscall

- 32-bit user space on a future architecture with 32-bit time_t calling
  'stat', which glibc translates into 'statx' because 'stat' is not
  available

- the implementation of statx only handles the user/kernel boundary;
  inside of 32-bit kernels, 'struct kstat' still uses 32-bit timestamps,
  so we still get wrong results even when both the file system and
  user space support 64-bit timestamps.
- file systems that represent timestamps outside of the [1970..2038]
  range differently on 32-bit and 64-bit kernels, returning times in the
  [1902..1969] range on 32-bit architectures but using the [2038..2106]
  range on 64-bit architectures. We still have a few of those and have
  intentionally not fixed them yet, since we want to change them to
  using the full [1970..2106] range on both 32-bit and 64-bit
  architectures in the future, once 64-bit time_t is used throughout the
  Linux VFS code. Some existing file systems (ext3, xfs, ...) have in
  the past fixed this problem by adapting the 64-bit kernels to use the
  [1902..2038] range, which makes it consistent but requires a different
  change to represent post-2038 timestamps on all architectures.

      Arnd
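
To make the signed/unsigned mismatch in the last case concrete, here is a
minimal sketch in Python. It is not kernel code: the helper names and the
two range constants are hypothetical and chosen only for illustration. It
reads the same 32-bit on-disk value once as signed and once as unsigned
seconds since the epoch, and shows the kind of clamping to a supported
range that the first case calls for.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical range constants, in seconds since the epoch.
SIGNED32_RANGE = (-2**31, 2**31 - 1)   # roughly 1901..2038
UNSIGNED32_RANGE = (0, 2**32 - 1)      # 1970..2106


def as_signed32(raw):
    """Interpret the low 32 bits of raw as a signed value."""
    raw &= 0xFFFFFFFF
    return raw - 2**32 if raw & 0x80000000 else raw


def as_unsigned32(raw):
    """Interpret the low 32 bits of raw as an unsigned value."""
    return raw & 0xFFFFFFFF


def clamp(seconds, lo, hi):
    """Limit a timestamp to the [lo, hi] range instead of letting it wrap."""
    return max(lo, min(seconds, hi))


def to_utc(seconds):
    """Convert seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)
```

With the on-disk value 0xFFFFFFFF, to_utc(as_signed32(...)) lands on
Dec 31, 1969, while to_utc(as_unsigned32(...)) lands in Feb 2106, which is
the kind of discrepancy described above. clamp(2**31, *SIGNED32_RANGE)
returns the largest signed 32-bit value rather than wrapping negative.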