Re: [PATCH 08/11] xfs: widen ondisk timestamps to deal with y2038 problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 20, 2020 at 3:03 AM Darrick J. Wong <darrick.wong@xxxxxxxxxx> wrote:
>
> On Wed, Aug 19, 2020 at 02:43:22PM -0700, Darrick J. Wong wrote:
> > On Wed, Aug 19, 2020 at 09:35:35AM +1000, Dave Chinner wrote:
> > > On Mon, Aug 17, 2020 at 03:57:39PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > >
> > > > Redesign the ondisk timestamps to be a simple unsigned 64-bit counter of
> > > > nanoseconds since 14 Dec 1901 (i.e. the minimum time in the 32-bit unix
> > > > time epoch).  This enables us to handle dates up to 2486, which solves
> > > > the y2038 problem.
> > > >
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > ---
> > > .....
> > > > +/* Convert an ondisk timestamp into the 64-bit safe incore format. */
> > > >  void
> > > >  xfs_inode_from_disk_timestamp(
> > > > + struct xfs_dinode               *dip,
> > > >   struct timespec64               *tv,
> > > >   const union xfs_timestamp       *ts)
> > > >  {
> > > > + if (dip->di_version >= 3 &&
> > > > +     (dip->di_flags2 & cpu_to_be64(XFS_DIFLAG2_BIGTIME))) {
> > > > +         uint64_t                t = be64_to_cpu(ts->t_bigtime);
> > > > +         uint64_t                s;
> > > > +         uint32_t                n;
> > > > +
> > > > +         s = div_u64_rem(t, NSEC_PER_SEC, &n);
> > > > +         tv->tv_sec = s - XFS_INO_BIGTIME_EPOCH;
> > > > +         tv->tv_nsec = n;
> > > > +         return;
> > > > + }
> > > > +
> > > >   tv->tv_sec = (int)be32_to_cpu(ts->t_sec);
> > > >   tv->tv_nsec = (int)be32_to_cpu(ts->t_nsec);
> > > >  }
> > >
> > > Can't say I'm sold on this union. It seems cleaner to me to just
> > > make the timestamp an opaque 64 bit field on disk and convert it to
> > > the in-memory representation directly in the to/from disk
> > > operations. e.g.:
> > >
> > > void
> > > xfs_inode_from_disk_timestamp(
> > >     struct xfs_dinode               *dip,
> > >     struct timespec64               *tv,
> > >     __be64                          ts)
> > > {
> > >
> > >     uint64_t                t = be64_to_cpu(ts);
> > >     uint64_t                s;
> > >     uint32_t                n;
> > >
> > >     if (xfs_dinode_is_bigtime(dip)) {
> > >             s = div_u64_rem(t, NSEC_PER_SEC, &n) - XFS_INO_BIGTIME_EPOCH;
> > >     } else {
> > >             s = (int)(t >> 32);
> > >             n = (int)(t & 0xffffffff);
> > >     }
> > >     tv->tv_sec = s;
> > >     tv->tv_nsec = n;
> > > }
> >
> > I don't like this open-coded union approach at all because now I have to
> > keep the t_sec and t_nsec bits separate in my head instead of letting
> > the C compiler take care of that detail.  The sample code above doesn't
> > handle that correctly either:
> >
> > Start with an old kernel on a little endian system; each uppercase
> > letter represents a byte (A is the LSB of t_sec, D is the MSB of t_sec,
> > E is the LSB of t_nsec, and H is the MSB of t_nsec):
> >
> >       sec  nsec (incore)
> >       ABCD EFGH
> >
> > That gets written out as:
> >
> >       sec  nsec (ondisk)
> >       DCBA HGFE
> >
> > Now reboot with a new kernel that only knows 64bit timestamps on disk:
> >
> >       64bit (ondisk)
> >       DCBAHGFE
> >
> > Now it does the first be64_to_cpu conversion:
> >       64bit (incore)
> >       EFGHABCD
> >
> > And then masks and shifts:
> >       sec  nsec (incore)
> >       EFGH ABCD
> >
> > Oops, we just switched the values!
> >
> > The correct approach (I think) is to perform the shifting and masking on
> > the raw __be64 value before converting them to incore format via
> > be32_to_cpu, but now I have to work out all four cases by hand instead
> > of letting the compiler do the legwork for me.  I don't remember if it's
> > correct to go around shifting and masking __be64 values.
> >
> > I guess the good news is that at least we have generic/402 to catch
> > these kinds of persistence problems, but ugh.
> >
> > Anyway, what are you afraid of?  The C compiler smoking crack and not
> > actually overlapping the two union elements?  We could control for
> > that...
>
> (Following up on the mailing list with something I pasted into IRC)
>
> Ok, so I temporarily patched up my dev tree with this approximation of
> how that would work, properly done:
>
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index c59ddb56bb90..7c71e4440402 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -176,8 +176,8 @@ xfs_inode_from_disk_timestamp(
>                 return;
>         }
>
> -       tv->tv_sec = (int)be32_to_cpu(ts->t_sec);
> -       tv->tv_nsec = (int)be32_to_cpu(ts->t_nsec);
> +       tv->tv_sec = (time64_t)be32_to_cpu((__be32)(ts->t_bigtime >> 32));
> +       tv->tv_nsec = be32_to_cpu(ts->t_bigtime & 0xFFFFFFFFU);
>  }
>
>  int
> @@ -294,8 +294,8 @@ xfs_inode_to_disk_timestamp(
>                 return;
>         }
>
> -       ts->t_sec = cpu_to_be32(tv->tv_sec);
> -       ts->t_nsec = cpu_to_be32(tv->tv_nsec);
> +       ts->t_bigtime = (__be64)cpu_to_be32(tv->tv_sec) << 32 |
> +                       cpu_to_be32(tv->tv_nsec);
>  }
>
>  void
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index d44e8932979b..5d36d6dea326 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -308,8 +308,8 @@ xfs_log_dinode_to_disk_ts(
>                 return;
>         }
>
> -       ts->t_sec = cpu_to_be32(its->t_sec);
> -       ts->t_nsec = cpu_to_be32(its->t_nsec);
> +       ts->t_bigtime = (__be64)cpu_to_be32(its->t_sec) << 32 |
> +                       cpu_to_be32(its->t_nsec);
>  }
>
> And immediately got a ton of smatch warnings:
>
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:297:26: warning: cast to restricted __be64
> xfs_inode_buf.c:297:26: warning: cast from restricted __be32
> xfs_inode_buf.c:297:26: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:298:25: warning: restricted __be32 degrades to integer
> xfs_inode_buf.c:297:23: warning: incorrect type in assignment (different base types)
> xfs_inode_buf.c:297:23:    expected restricted __be64 [usertype] t_bigtime
> xfs_inode_buf.c:297:23:    got unsigned long long
>
> (and even more in xfs_inode_item.c)
>
> So... while we could get rid of the union and hand-decode the timestamp
> from a __be64 on legacy filesystems, I see the static checker complaints
> as a second piece of evidence that this would be unnecessarily risky.
>

And unnecessarily make the code less readable and harder to review.
To what end? Dave writes:
"I just didn't really like the way the code in the encode/decode
helpers turned out..."

Cannot respond to that argument on a technical review.
I can only say that as a reviewer, the posted version was clear and easy
for me to verify and the posted alternative that turned out to have a bug,
I would never have never caught that bug in review and I would not have
felt confident about verifying the code in review either.

Thanks,
Amir.



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux