ISO 8601 permits "reduced precision" time representations to omit the seconds value or both the minutes and the seconds values. The abbreviate times could look like 17:45 or 1745 to omit the seconds, or simply as 17 to omit both the minutes and the seconds. parse_date_basic accepts the 17:45 format but it rejects the other two. Change it to accept 4-digit and 2-digit time values when they follow a recognized date and a 'T'. Before this change: $ TZ=UTC test-tool date approxidate 2022-12-13T23:00 2022-12-13T2300 2022-12-13T23 2022-12-13T23:00 -> 2022-12-13 23:00:00 +0000 2022-12-13T2300 -> 2022-12-13 23:54:13 +0000 2022-12-13T23 -> 2022-12-13 23:54:13 +0000 After this change: $ TZ=UTC helper/test-tool date approxidate 2022-12-13T23:00 2022-12-13T2300 2022-12-13T23 2022-12-13T23:00 -> 2022-12-13 23:00:00 +0000 2022-12-13T2300 -> 2022-12-13 23:00:00 +0000 2022-12-13T23 -> 2022-12-13 23:00:00 +0000 Note: ISO 8601 also allows reduced precision date strings such as "2022-12" and "2022". This patch does not attempt to address these. Reported-by: Pat LaVarre <plavarre@xxxxxxxxxxxxxxx> Signed-off-by: Phil Hord <phil.hord@xxxxxxxxx> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@xxxxxxxxx> --- Since this is a complete re-implementation from Phil Hord's version. I'm reassigning the author to me. This version change the implementation to only treat the string as ISO8601 if a 'T' existed and date has been parsed. I also added a test for parsing RFC-822, which Hord accidentally broke. The commit message has been changed: * The example has been changed to be independent from local timezone * Remove the mention of adding test-cases, since it's obviously necessary. date.c | 37 +++++++++++++++++++++++++++++++++++++ t/t0006-date.sh | 8 ++++++++ 2 files changed, 45 insertions(+) diff --git a/date.c b/date.c index 53bd6a7932..6f45eeb356 100644 --- a/date.c +++ b/date.c @@ -493,6 +493,12 @@ static int match_alpha(const char *date, struct tm *tm, int *offset) return 2; } + /* ISO-8601 allows yyyymmDD'T'HHMMSS, with less precision */ + if (*date == 'T' && isdigit(date[1]) && tm->tm_hour == -1) { + tm->tm_min = tm->tm_sec = 0; + return 1; + } + /* BAD CRAP */ return skip_alpha(date); } @@ -638,6 +644,18 @@ static inline int nodate(struct tm *tm) tm->tm_sec) < 0; } +/* + * Have we seen an ISO-8601-alike date, i.e. 20220101T0, + * In which, hour is still unset, + * and minutes and second has been set to 0. + */ +static inline int maybeiso8601(struct tm *tm) +{ + return tm->tm_hour == -1 && + tm->tm_min == 0 && + tm->tm_sec == 0; +} + /* * We've seen a digit. Time? Year? Date? */ @@ -701,6 +719,25 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt return end - date; } + /* reduced precision of ISO-8601's time: HHMM or HH */ + if (maybeiso8601(tm)) { + unsigned int num1 = num; + unsigned int num2 = 0; + if (n == 4) { + num1 = num / 100; + num2 = num % 100; + } + if ((n == 4 || n == 2) && !nodate(tm) && + set_time(num1, num2, 0, tm) == 0) + return n; + /* + * We thought this is an ISO-8601 time string, + * we set minutes and seconds to 0, + * turn out it isn't, rollback the change. + */ + tm->tm_min = tm->tm_sec = -1; + } + /* Four-digit year or a timezone? */ if (n == 4) { if (num <= 1400 && *offset == -1) { diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 2490162071..e18b160286 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -88,6 +88,13 @@ check_parse 2008-02-14 bad check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000' check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500' check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500' +check_parse '20080214T20:30:45' '2008-02-14 20:30:45 +0000' +check_parse '20080214T20:30' '2008-02-14 20:30:00 +0000' +check_parse '20080214T20' '2008-02-14 20:00:00 +0000' +check_parse '20080214T203045' '2008-02-14 20:30:45 +0000' +check_parse '20080214T2030' '2008-02-14 20:30:00 +0000' +check_parse '20080214T000000.20' '2008-02-14 00:00:00 +0000' +check_parse '20080214T00:00:00.20' '2008-02-14 00:00:00 +0000' check_parse '20080214T203045-04:00' '2008-02-14 20:30:45 -0400' check_parse '20080214T203045 -04:00' '2008-02-14 20:30:45 -0400' check_parse '20080214T203045.019-04:00' '2008-02-14 20:30:45 -0400' @@ -99,6 +106,7 @@ check_parse '2008-02-14 20:30:45 -05' '2008-02-14 20:30:45 -0500' check_parse '2008-02-14 20:30:45 -:30' '2008-02-14 20:30:45 +0000' check_parse '2008-02-14 20:30:45 -05:00' '2008-02-14 20:30:45 -0500' check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 -0500' EST5 +check_parse 'Thu, 7 Apr 2005 15:14:13 -0700' '2005-04-07 15:14:13 -0700' check_approxidate() { echo "$1 -> $2 +0000" >expect -- 2.39.0.287.g690a66fa66