On Fri, Jun 24, 2011 at 15:46, Andreas Dilger wrote: > On 2011-06-23, at 4:37 PM, Mark Harris wrote: >> 2011/6/23 Akira Fujita <a-fujita@xxxxxxxxxxxxx>: >>>> https://bugzilla.kernel.org/show_bug.cgi?id=23732 >>> >>> ext4: Fix ext4 timestamps corruption >>> >>> Officially, ext4 can handle its timestamps until 2514 >>> with 32bit entries plus EPOCH_BIT (2bits). >>> But when timestamps values use 32+ bit >>> (e.g. 2038-01-19 9:14:08 0x0000000080000000), >>> we can get corrupted values. >>> Because sign bit is overwritten by transferring value >>> between kernel space and user space. >>> >>> To fix this issue, 32th bit of extra time fields in ext4_inode structure >>> (e.g. i_ctime_extra) are used as the sign for 64bit user space. >>> Because these are used only 20bits for nano-second and bottom of 2bits >>> are for EXT4_EPOCH_BITS shift. >>> With this patch, ext4 supports timestamps Y1901-2514. >> >> Thanks for looking into this bug. However tv_nsec is in the >> range 0..999999999 and requires 30 bits. That is why tv_sec was >> only extended by 2 bits. So there are no additional spare bits >> in the "extra" field. >> >> 34-bit seconds can accommodate a maximum of 544.4 years, e.g. >> 1970..2514 or 1901..2446. Although an early version of the patch >> for 34-bit tv_sec in ext4 worked with years 1970..2514, prior to >> being committed the patch was changed to support pre-1970 >> timestamps (introducing the sign extension issue in the decoding): >> http://marc.info/?l=linux-ext4&m=118208541320999 > > Sigh, it seems so unlikely that anyone even has a valid timestamp > on a file with a date before 1970, it makes me wonder if this extra > effort is even worthwhile... > >> The existing encoding simply encodes bits 31..0 of tv_sec in the >> regular time field and bits 33..32 in the extra field (along with >> the 30-bit tv_nsec). The issue is in the decoding, which I think >> can be addressed by changing only the body of the "if" in in the >> ext4_decode_extra_time function, to something like this: >> >> time->tv_sec += ((__u32)time->tv_sec + >> ((__u64)le32_to_cpu(extra) << 32) + >> 0x80000000LL) & 0x300000000LL; >> >> This is untested, and might look nicer with some macros, but >> it should decode the 34 bits into a timestamp in the range >> -0x80000000 (1901-12-13) to 0x37fffffff (2446-05-10), while >> retaining compatibility with the existing encoding. >> >> 2 msb of adjustment needed to convert >> extra 32-bit sign-extended 32-bit tv_sec >> bits time decoded 64-bit tv_sec to decoded 64-bit tv_sec >> 1 1 1 -0x80000000..-1 0 >> 0 0 0 0x000000000..0x07fffffff 0 >> 0 0 1 0x080000000..0x0ffffffff 0x100000000 >> 0 1 0 0x100000000..0x17fffffff 0x100000000 >> 0 1 1 0x180000000..0x1ffffffff 0x200000000 >> 1 0 0 0x200000000..0x27fffffff 0x200000000 >> 1 0 1 0x280000000..0x2ffffffff 0x300000000 >> 1 1 0 0x300000000..0x37fffffff 0x300000000 > > The problem with this encoding is that it requires existing 32-bit > timestamps before 1970 to have encoded "11" in the extra epoch bits, > which is not the case. Current pre-1970 timestamps would be encoded > with "00" there, which would (according to your table) bump them past > 2038 incorrectly. I was under the impression that the encoding code stored bits 33 & 32 of tv_sec there, which would be 1,1 for a negative value like -1. Certainly the decoding would be simpler if the extra value was only non-zero for large timestamps. On closer inspection of ext4_encode_extra_time, it looks like for tv_sec = -1, a 64-bit kernel will store 1,1 in the extra bits and a 32-bit kernel will store 0,0 in the extra bits. That is a problem if both of these need to be decoded as -1 on a 64-bit system. > > What we need is an encoding that preserves the times for extra epoch "00": > > 2 msb of adjustment needed to convert > extra 32-bit sign-extended 32-bit tv_sec > bits time decoded 64-bit tv_sec to decoded 64-bit tv_sec > 0 0 1 -0x80000000..-1 0 > 0 0 0 0x000000000..0x07fffffff 0 > 0 1 1 0x080000000..0x0ffffffff 0x100000000 > 0 1 0 0x100000000..0x17fffffff 0x100000000 > 1 0 1 0x180000000..0x1ffffffff 0x200000000 > 1 0 0 0x200000000..0x27fffffff 0x200000000 > 1 1 1 0x280000000..0x2ffffffff 0x300000000 > 1 1 0 0x300000000..0x37fffffff 0x300000000 > > So, looking at the above desired encoding, it looks like the error in > the existing code is that it is doing a boolean operation on decode > instead of a mathematical one, and it was incorrectly trying to extend > the time to (1ULL<<34). The below should fix this: > > static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra) > { > if (unlikely(sizeof(time->tv_sec) > 4 && > (extra & cpu_to_le32(EXT4_EPOCH_MASK))) > time->tv_sec += (u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32; > > time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS; > } That is not compatible with the existing ext4_encode_extra_time. For example, 2038-01-31 (0x80101500) is encoded with extra bits equal to bits 33 & 32, i.e. 0,0. But this code would decode it as 1901-12-25 (i.e. it would leave the sign-extended 32-bit value unchanged). Possible solutions: (a) Define the current 64-bit encoding as the correct encoding since the 2 extra bits are not even decoded on 32-bit kernels, so its encoding doesn't matter much. However, if anyone with existing pre-1970 timestamps written using a 32-bit kernel wants to use their ext4 filesystem with a 64-bit kernel, the pre-1970 timestamps would be wrong unless they re-write them with a fixed kernel. Change ext4_decode_extra_time "if" body to something like: time->tv_sec += ((__u32)time->tv_sec + ((__u64)le32_to_cpu(extra) << 32) + 0x80000000LL) & 0x300000000LL; Change ext4_encode_extra_time ": 0" to something like: time->tv_sec < 0 ? EXT4_EPOCH_MASK : 0 (b) Change the encoding of the extra bits to be those in your new table. This is compatible with the 32-bit kernel encoding (which does not decode these bits) but incompatible with the 64-bit kernel encoding. Existing pre-1970 timestamps written with a 64-bit kernel would be decoded as dates far in the future. Requires your change to ext4_decode_extra_time. Also requires a change to ext4_encode_extra_time, changing (time->tv_sec >> 32) to something like: ((time->tv_sec - (signed int)time->tv_sec) >> 32) (c) If 100% compatibility with existing pre-1970 32-bit timestamps is desired even when switching between 32-bit and 64-bit kernels, both extra=1,1/msb=1 and extra=0,0/msb=1 could be treated as year 1901..1969 timestamps. However this would reduce the maximum 64-bit ext4 timestamp, and would necessarily be incompatible with the existing 64-bit kernel encoding of timestamps > year 2038 (since a current 64-bit kernel encodes a year 2039 timestamp exactly the same as a current 32-bit kernel encodes a year 1902 timestamp). This requires additional complexity in both ext4_decode_extra_time and ext4_encode_extra_time. (d) Declare that ext4 supports only timestamps with year >= 1970. i.e. 1970..2514 (64-bit), 1970..2038 (32-bit). Any existing pre-1970 timestamps would now be interpreted as a year >= 2038 timestamp on 64-bit kernels. It may be possible for users of 32-bit kernels to continue to successfully read and write 1901..1969 timestamps, but this would have to be unsupported. If such a timestamp was read with a 64-bit kernel, or a program like fsck.ext4, the time may be different. If some day, as 2038 approaches, 32-bit time_t is changed to unsigned, ext4 would once again support all 32-bit time_t values. To implement, the decoding can simply drop all casts to (signed). Optionally, the encoding could encode any negative tv_sec as 0 to make 32-bit and 64-bit behavior for pre-1970 timestamps consistent (bugzilla 5079/8643). However this may break some uses of pre-1970 timestamps that would otherwise work on 32-bit kernels. - Mark > > Cheers, Andreas -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html