Re: [PATCH v5 0/8] Introduce timestamp_t for timestamps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 24.04.2017 um 15:57 schrieb Johannes Schindelin:
Git v2.9.2 was released in a hurry to accomodate for platforms like
Windows, where the `unsigned long` data type is 32-bit even for 64-bit
setups.

The quick fix was to simply disable all the testing with "absurd" future
dates.

However, we can do much better than that, as we already make use of
64-bit data types internally. There is no good reason why we should not
use the same for timestamps. Hence, let's use uintmax_t for timestamps.

Note: while the `time_t` data type exists and is meant to be used for
timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration
used `time_t` for that reason, but it came with a few serious downsides:
as `time_t` can be signed (and indeed, on Windows it is an int64_t),
Git's expectation that 0 is the minimal value does no longer hold true,
introducing its own set of interesting challenges. Besides, if we *can*
handle far in the future timestamps (except for formatting them using
the system libraries), it is more consistent to do so.

time_t is signed on Linux and BSDs as well.

Using an unsigned type gives us the ability to represent times beyond
the 292 billion years in the future that int64_t would give us, but
prevents recording events that occurred before the Epoch.  That doesn't
sound like a good deal to me -- storing historical works (e.g. law
texts) with real time stamps is probably more interesting than fixing
the year 292277026596 problem within this decade.

The upside of using `uintmax_t` for timestamps is that we do a much
better job to support far in the future timestamps across all platforms,
including 32-bit ones. The downside is that those platforms that use a
32-bit `time_t` will barf when parsing or formatting those timestamps.

IIUC this series has two aims: solving the year 2038 problem on 32-bit
Linux by replacing time_t (int32_t), and solving the year 2106 problem
on Windows by replacing unsigned long (uint32_t), right?  The latter one
sounds more interesting, because 32-bit platforms would still be unable
to fully use bigger time values as you wrote above.

Can we leave time_t alone and just do the part where you replace
unsigned long with timestamp_t defined as uint64_t?  That should already
help on Windows, correct?  When/if timestamp_t is later changed to a
signed type then we could easily convert the time_t cases to timestamp_t
as well, or the other way around.

This iteration makes the date_overflows() check more stringent again.

It is arguably a bug to paper over too-large author/committer dates and
to replace them with Jan 1 1970 without even telling the user that we do
that, but this is the behavior that t4212 verifies, so I reinstated that
behavior. The change in behavior was missed because of the missing
unsigned_add_overflows() test.

I can't think of many ways to get future time stamps (broken clock,
broken CMOS battery, bit rot, time travel), so I wouldn't expect a
change towards better error reporting to affect a lot of users.  (Not
necessarily as part of this series, of course.)

Changes since v4:

- in gm_time_t(), we now test specifically that the timezone adjustment
   neither underflows nor overflows.

- the patch introduced in v4 that tried to defer the date_overflows()
   check to gm_time_t() rather than replacing the ident timestamp by a 0
   without any warning was dropped again: it broke t4212.


Johannes Schindelin (8):
   ref-filter: avoid using `unsigned long` for catch-all data type
   t0006 & t5000: prepare for 64-bit timestamps
   t0006 & t5000: skip "far in the future" test when time_t is too
     limited
   Specify explicitly where we parse timestamps
   Introduce a new "printf format" for timestamps
   Introduce a new data type for timestamps
   Abort if the system time cannot handle one of our timestamps
   Use uintmax_t for timestamps

  Documentation/technical/api-parse-options.txt |   8 +-
  archive-tar.c                                 |   5 +-
  archive-zip.c                                 |  12 ++-
  archive.h                                     |   2 +-
  builtin/am.c                                  |   4 +-
  builtin/blame.c                               |  14 ++--
  builtin/fsck.c                                |   6 +-
  builtin/gc.c                                  |   2 +-
  builtin/log.c                                 |   4 +-
  builtin/merge-base.c                          |   2 +-
  builtin/name-rev.c                            |   6 +-
  builtin/pack-objects.c                        |   4 +-
  builtin/prune.c                               |   4 +-
  builtin/receive-pack.c                        |  14 ++--
  builtin/reflog.c                              |  24 +++---
  builtin/rev-list.c                            |   2 +-
  builtin/rev-parse.c                           |   2 +-
  builtin/show-branch.c                         |   4 +-
  builtin/worktree.c                            |   4 +-
  bundle.c                                      |   4 +-
  cache.h                                       |  14 ++--
  commit.c                                      |  18 ++--
  commit.h                                      |   2 +-
  config.c                                      |   2 +-
  credential-cache--daemon.c                    |  12 +--
  date.c                                        | 113 ++++++++++++++------------
  fetch-pack.c                                  |   8 +-
  fsck.c                                        |   2 +-
  git-compat-util.h                             |   9 ++
  http-backend.c                                |   4 +-
  parse-options-cb.c                            |   4 +-
  pretty.c                                      |   4 +-
  reachable.c                                   |   9 +-
  reachable.h                                   |   4 +-
  ref-filter.c                                  |  22 ++---
  reflog-walk.c                                 |   8 +-
  refs.c                                        |  14 ++--
  refs.h                                        |   8 +-
  refs/files-backend.c                          |   8 +-
  revision.c                                    |   6 +-
  revision.h                                    |   4 +-
  sha1_name.c                                   |   6 +-
  t/helper/test-date.c                          |  18 ++--
  t/helper/test-parse-options.c                 |   4 +-
  t/helper/test-ref-store.c                     |   4 +-
  t/t0006-date.sh                               |   4 +-
  t/t5000-tar-tree.sh                           |   6 +-
  t/test-lib.sh                                 |   3 +
  tag.c                                         |   6 +-
  tag.h                                         |   2 +-
  upload-pack.c                                 |   8 +-
  vcs-svn/fast_export.c                         |   8 +-
  vcs-svn/fast_export.h                         |   4 +-
  vcs-svn/svndump.c                             |   2 +-
  wt-status.c                                   |   2 +-
  55 files changed, 260 insertions(+), 219 deletions(-)

How did you find all the pieces of code that need to be touched?  Is
there a regex or something that can be used to spot new such places
that sneak in, e.g. through in-flight merges?

base-commit: e2cb6ab84c94f147f1259260961513b40c36108a
Published-As: https://github.com/dscho/git/releases/tag/time_t-may-be-int64-v5
Fetch-It-Via: git fetch https://github.com/dscho/git time_t-may-be-int64-v5

Interdiff vs v4:

  diff --git a/date.c b/date.c
  index 75f6335cd09..63fa99685e2 100644
  --- a/date.c
  +++ b/date.c
  @@ -47,9 +47,16 @@ static time_t gm_time_t(timestamp_t time, int tz)
   	minutes = (minutes / 100)*60 + (minutes % 100);
   	minutes = tz < 0 ? -minutes : minutes;
- if (date_overflows(time + minutes * 60))
  +	if (minutes > 0) {
  +		if (unsigned_add_overflows(time, minutes * 60))
  +			die("Timestamp+tz too large: %"PRItime" +%04d",
  +			    time, tz);
  +	} else if (time < -minutes * 60)
  +		die("Timestamp before Unix epoch: %"PRItime" %04d", time, tz);
  +	time += minutes * 60;
  +	if (date_overflows(time))
   		die("Timestamp too large for this system: %"PRItime, time);
  -	return (time_t)time + minutes * 60;
  +	return (time_t)time;
   }
/*
  diff --git a/pretty.c b/pretty.c
  index 35fd290096a..587d48371b0 100644
  --- a/pretty.c
  +++ b/pretty.c
  @@ -410,10 +410,14 @@ const char *show_ident_date(const struct ident_split *ident,
if (ident->date_begin && ident->date_end)
   		date = parse_timestamp(ident->date_begin, NULL, 10);
  -	if (ident->tz_begin && ident->tz_end)
  -		tz = strtol(ident->tz_begin, NULL, 10);
  -	if (tz >= INT_MAX || tz <= INT_MIN)
  -		tz = 0;
  +	if (date_overflows(date))
  +		date = 0;
  +	else {
  +		if (ident->tz_begin && ident->tz_end)
  +			tz = strtol(ident->tz_begin, NULL, 10);
  +		if (tz >= INT_MAX || tz <= INT_MIN)
  +			tz = 0;
  +	}
   	return show_date(date, tz, mode);
   }



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]