[PATCH] fast-import: accept invalid timezones so we can import existing repos

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Elijah Newren <newren@xxxxxxxxx>

There are multiple repositories in the wild with random, invalid
timezones.  Most notably is a commit from rails.git with a timezone of
"+051800"[1].  However, a few searches found other repos with that same
invalid timezone.  Further, Peff reports that GitHub relaxed their fsck
checks in August 2011 to accept any timezone value[2], and there have
been multiple reports to filter-repo about fast-import crashing while
trying to import their existing repositories since they had timezone
values such as "-7349423" and "-43455309"[3].

It is not clear what created these invalid timezones, but since git has
permitted their use and worked with these repositories for years at this
point, it seems pointless to make fast-import be the only thing that
disallows them.  Relax the parsing to allow these timezones when using
raw import format; when using --date-format=rfc2822 (which is not the
default), we can continue being more strict about timezones.

[1] https://github.com/rails/rails/commit/4cf94979c9f4d6683c9338d694d5eb3106a4e734
[2] https://lore.kernel.org/git/20200521195513.GA1542632@xxxxxxxxxxxxxxxxxxxxxxx/
[3] https://github.com/newren/git-filter-repo/issues/88

Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
---
    fast-import: accept invalid timezones so we can import existing repos
    
    Note: this is not a fix for a regression, so you can ignore it for 2.27;
    it can sit in pu.
    
    Peff leaned towards normalizing these timezones in fast-export[1], but
    (A) it's not clear what "valid" timezone we should randomly pick and
    regardless of what we pick, it seems it'll be wrong for most cases, (B)
    it would provide yet another way that "git fast-export --all | git
    fast-import" would not preserve the original history, as users sometimes
    expect[2], and (C) that'd prevent users from passing a special callback
    to filter-repo to fix up these values[3].
    
    Since I'm not a fan of picking a random value to reassign these to (in
    either fast-export or fast-import), I went the route of relaxing the
    requirements in fast-import, similar to what Peff reports GitHub did
    about 9 years ago in their incoming fsck checks.
    
    [1] 
    https://lore.kernel.org/git/20200521195513.GA1542632@xxxxxxxxxxxxxxxxxxxxxxx/
    [2] 
    https://lore.kernel.org/git/CABPp-BFLJ48BZ97Y9mr4i3q7HMqjq18cXMgSYdxqD1cMzH8Spg@xxxxxxxxxxxxxx/
    [3] Example: 
    https://github.com/newren/git-filter-repo/issues/88#issuecomment-629706776

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-795%2Fnewren%2Floosen-fast-import-timezone-parsing-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-795/newren/loosen-fast-import-timezone-parsing-v1
Pull-Request: https://github.com/git/git/pull/795

 fast-import.c          |  7 +++----
 t/t9300-fast-import.sh | 17 +++++++++++++++++
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index c98970274c4..4a3c193007d 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1915,11 +1915,10 @@ static int validate_raw_date(const char *src, struct strbuf *result)
 {
 	const char *orig_src = src;
 	char *endp;
-	unsigned long num;
 
 	errno = 0;
 
-	num = strtoul(src, &endp, 10);
+	strtoul(src, &endp, 10);
 	/* NEEDSWORK: perhaps check for reasonable values? */
 	if (errno || endp == src || *endp != ' ')
 		return -1;
@@ -1928,8 +1927,8 @@ static int validate_raw_date(const char *src, struct strbuf *result)
 	if (*src != '-' && *src != '+')
 		return -1;
 
-	num = strtoul(src + 1, &endp, 10);
-	if (errno || endp == src + 1 || *endp || 1400 < num)
+	strtoul(src + 1, &endp, 10);
+	if (errno || endp == src + 1 || *endp)
 		return -1;
 
 	strbuf_addstr(result, orig_src);
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 768257b29e0..0e798e68476 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -410,6 +410,23 @@ test_expect_success 'B: accept empty committer' '
 	test -z "$out"
 '
 
+test_expect_success 'B: accept invalid timezone' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/invalid-timezone
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 1234567890 +051800
+	data <<COMMIT
+	empty commit
+	COMMIT
+	INPUT_END
+
+	test_when_finished "git update-ref -d refs/heads/invalid-timezone
+		git gc
+		git prune" &&
+	git fast-import <input &&
+	git cat-file -p invalid-timezone >out &&
+	grep "1234567890 [+]051800" out
+'
+
 test_expect_success 'B: accept and fixup committer with no name' '
 	cat >input <<-INPUT_END &&
 	commit refs/heads/empty-committer-2

base-commit: 2d5e9f31ac46017895ce6a183467037d29ceb9d3
-- 
gitgitgadget



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux