Re: 32-bit stat returns wrong st_mtime if file timestamp does not fit in 32 bits

On Mon, Nov 6, 2017 at 10:10 AM, David Howells <dhowells@xxxxxxxxxx> wrote:
> Paul Eggert <eggert@xxxxxxxxxxx> wrote:
>
>> Florian Weimer suggested I write you about a bug in 'stat' and related system
>> calls, a bug that will become more important as time goes on. The bug is that
>> when running in a 32-bit application, 'stat' reports incorrect time stamps for
>> files whose timestamps are so positive (or so negative) that they do not fit
>> into 32-bit signed integers. POSIX says that such 'stat' calls should fail
>> with errno == EOVERFLOW, and this is common practice elsewhere. However, the
>> Linux kernel causes 'stat' to succeed in this situation, discarding the
>> high-order bits of the time stamp.
>>
>> This bug is causing gzip 'make test' to fail when built in 32-bit mode and
>> running atop a 64-bit Linux kernel, as part of gzip's test for out-of-range
>> file timestamps that are encoded in compressed gzip streams; see:
>>
>> https://debbugs.gnu.org/bug=25636
>>
>> I filed a bug report about 'stat' for Fedora here:
>>
>> https://bugzilla.redhat.com/1419736
>>
>> and Florian suggested that I raise the issue with you.
>>
>> Is the intent that all 32-bit Linux apps will migrate to x32 or some other
>> model with 64-bit time_t,

32-bit applications will not migrate to x32, but all 32-bit architectures will
change to using 64-bit time_t at some point.

>> and that this will happen well before the year 2038
>> so the bug is unimportant?

Time stamps past 2038 and before 1970 are already important, because
they can be set using 'utimes/futimes/utimensat', and applications
do run into problems. E.g. copying a file from a file system with
no timestamps (everything is 0, a.k.a. Jan 1, 1970) to another file system
with incorrect time zones can give you timestamps on Dec 31, 1969.
Copying such a file to another file system that doesn't support pre-1970
timestamps (e.g. AFS) can give you timestamps in 2106, which in
turn can get represented in random other ways on file systems with
different sets of limitations.
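
To make the 1969-to-2106 jump concrete, here is a rough sketch of the
arithmetic. It is not taken from any file system; the helper names are
made up for illustration, and it assumes an on-disk field that keeps
only the low 32 bits of the seconds value:

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 32 bits of a 64-bit
 * timestamp, as a file system with a 32-bit seconds field would. */
static uint32_t store_low32(int64_t seconds)
{
    return (uint32_t)seconds;   /* well-defined: reduces modulo 2^32 */
}

/* Interpret the stored field as unsigned: range [1970..2106]. */
static int64_t load_as_unsigned32(uint32_t field)
{
    return (int64_t)field;
}

/* Interpret the same field as signed: range [1902..2038].
 * Written without relying on implementation-defined narrowing. */
static int64_t load_as_signed32(uint32_t field)
{
    if (field >= 0x80000000u)
        return (int64_t)field - 4294967296LL;
    return (int64_t)field;
}
```

With these, a timestamp of -1 (one second before the epoch, i.e.
Dec 31, 1969) is stored as 0xffffffff. Read back as signed it is still
-1, but read back as unsigned it becomes 4294967295, which is
Feb 7, 2106.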

I think we want to keep the behavior as sane as possible in all those
cases.

>> Or is the intent that the 32-bit Linux clock will
>> wrap around after 2038 and go negative? Or is the intent something else? I'd
>> like to know so that I can fix gzip and other applications and their test
>> cases to match the intent.

Deepa has proposed a patch series for this in the past, limiting
the range of the timestamps to whatever is supported to prevent
an overflow.

It's probably well down her backlog behind other patch series, but
it's certainly still on the radar.
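
The idea of limiting a timestamp to the supported range could look
roughly like the sketch below. This is not the patch series itself;
the function name and the per-file-system limits passed in are
assumptions for illustration only:

```c
#include <stdint.h>

/* Hypothetical helper: limit a 64-bit timestamp to the range a given
 * file system can store, instead of letting the value wrap around. */
static int64_t clamp_timestamp(int64_t seconds, int64_t fs_min, int64_t fs_max)
{
    if (seconds < fs_min)
        return fs_min;
    if (seconds > fs_max)
        return fs_max;
    return seconds;
}
```

For a file system with a signed 32-bit seconds field one would pass
INT32_MIN and INT32_MAX as the limits; for an unsigned 32-bit field,
0 and UINT32_MAX. A pre-1970 stamp written to the latter then ends up
at 0 rather than somewhere in 2106.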

> The kernel now offers the statx() system call:
>
>         http://man7.org/linux/man-pages/man2/statx.2.html
>
> But it's not yet supported by glibc.

statx doesn't address all of the issues here, since we have cases that
are easy to confuse:

- setting an out-of-range timestamp with utimensat() on a file system that
  is more limited than the software, e.g. setting a stamp after 2038 on ext3fs
  or a stamp before 1970 on AFS. These should be clamped to the
  minimum/maximum supported time.

- reading a file system timestamp that is out of range for user space,
  e.g. a time stamp past 2038 on an ext4 file system with long inodes
  while using 32-bit user space and the stat() syscall.

- 32-bit user space on a future architecture with 32-bit time_t calling
  'stat', which glibc translates into 'statx' because 'stat' is not available.

- the implementation of statx only handles the user/kernel
  boundary. Inside 32-bit kernels, 'struct kstat' still uses 32-bit
  timestamps, so we still get wrong results even when both the file
  system and user space support 64-bit timestamps.

- file systems that represent timestamps outside of the [1970..2038]
  range differently on 32-bit and 64-bit kernels, returning times in the
  [1902..1969] range on 32-bit architectures but using the [2038..2106]
  range on 64-bit architectures. We still have a few of those and have
  intentionally not fixed them yet, since we want to change them to
  use the full [1970..2106] range on both 32-bit and 64-bit architectures
  in the future, once 64-bit time_t is used throughout the Linux VFS
  code.
  Some existing file systems (ext3, xfs, ...) have in the past fixed
  this problem by adapting the 64-bit kernels to use the [1902..2038]
  range, which makes it consistent but requires a different change to
  represent post-2038 timestamps on all architectures.
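
For the second case in the list (reading an out-of-range timestamp
into 32-bit user space), the behavior Paul describes from POSIX would
amount to a range check at the point where the 64-bit value is
narrowed. A minimal sketch, with a made-up function name and not
copied from the kernel or glibc:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: narrow a 64-bit timestamp to a 32-bit time_t.
 * Returns 0 on success, or -EOVERFLOW if the value does not fit,
 * rather than silently discarding the high-order bits. */
static int narrow_to_time32(int64_t seconds, int32_t *out)
{
    if (seconds < INT32_MIN || seconds > INT32_MAX)
        return -EOVERFLOW;
    *out = (int32_t)seconds;
    return 0;
}
```

A value of 2147483647 (the last second representable in 2038) passes
through unchanged, while 2147483648 is rejected with -EOVERFLOW.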

         Arnd


