[PATCH v2] git-p4: fix git-p4.pathEncoding for removed files

In a9e38359e3 we taught git-p4 a way to re-encode path names from what
was used in Perforce to UTF-8. This path re-encoding worked properly for
"added" paths. "Removed" paths were not re-encoded and therefore
differed from the "added" paths. Consequently, these files were not
removed in a git-p4 cloned Git repository because the path names did not
match.

Fix this by moving the re-encoding to a place that affects "added" and
"removed" paths. Add a test to demonstrate the issue.

Signed-off-by: Lars Schneider <larsxschneider@xxxxxxxxx>
---

Hi,

unfortunately, I forgot to send this v2. I agree with Luke's review and
moved the re-encoding of the path name into `streamOneP4File` and
`streamOneP4Deletion` explicitly.
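
For readers who want to see the effect in isolation, here is a minimal
standalone sketch of the re-encoding step. It is not the code from the
patch: it is written for Python 3 with byte strings, the helper name
`encode_with_utf8` and its `path_encoding` parameter are hypothetical
stand-ins for `encodeWithUTF8` and the `git-p4.pathEncoding` lookup, and
the sample path is made up for illustration.

```python
def encode_with_utf8(path, path_encoding=None):
    """Return `path` (bytes) unchanged if it is pure ASCII, else re-encoded as UTF-8.

    `path_encoding` is a hypothetical parameter standing in for the value of
    git-p4.pathEncoding; when unset, the bytes are assumed to be UTF-8 already.
    """
    try:
        path.decode('ascii')
    except UnicodeDecodeError:
        encoding = path_encoding or 'utf8'
        # Undecodable bytes are replaced rather than raising an error.
        path = path.decode(encoding, 'replace').encode('utf8', 'replace')
    return path


# Illustrative path with ISO-8859-1 bytes for a-umlaut, o-umlaut, u-umlaut.
p4_path = b'a-\xe4_o-\xf6_u-\xfc.txt'

# Both the "add" and the "delete" code path call the same helper, so the
# names written to the fast-import stream are identical.
added = encode_with_utf8(p4_path, 'iso8859-1')
removed = encode_with_utf8(p4_path, 'iso8859-1')
assert added == removed

# If only the "add" side is re-encoded, the raw deletion path never matches.
assert p4_path != added
```

The point of the sketch is only that a deletion has to name exactly the
same bytes that the addition used, which is why both functions now go
through one helper.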

Discussion:
http://public-inbox.org/git/CAE5ih7-=bD_ZoL5pFYfD2Qvy-XE24V_cgge0XoAvuoTK02EDfg@xxxxxxxxxxxxxx/

Thanks,
Lars


Notes:
    Base Commit: 454cb6bd52 (v2.11.0)
    Diff on Web: https://github.com/larsxschneider/git/commit/75ed3e92e2
    Checkout:    git fetch https://github.com/larsxschneider/git git-p4/fix-path-encoding-v2 && git checkout 75ed3e92e2

    Interdiff (v1..v2):

    diff --git a/git-p4.py b/git-p4.py
    index 8f311cb4e8..dac8b4955d 100755
    --- a/git-p4.py
    +++ b/git-p4.py
    @@ -2366,15 +2366,6 @@ class P4Sync(Command, P4UserMap):
                         break

             path = wildcard_decode(path)
    -        try:
    -            path.decode('ascii')
    -        except:
    -            encoding = 'utf8'
    -            if gitConfig('git-p4.pathEncoding'):
    -                encoding = gitConfig('git-p4.pathEncoding')
    -            path = path.decode(encoding, 'replace').encode('utf8', 'replace')
    -            if self.verbose:
    -                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, path)
             return path

         def splitFilesIntoBranches(self, commit):
    @@ -2427,11 +2418,24 @@ class P4Sync(Command, P4UserMap):
                 self.gitStream.write(d)
             self.gitStream.write('\n')

    +    def encodeWithUTF8(self, path):
    +        try:
    +            path.decode('ascii')
    +        except:
    +            encoding = 'utf8'
    +            if gitConfig('git-p4.pathEncoding'):
    +                encoding = gitConfig('git-p4.pathEncoding')
    +            path = path.decode(encoding, 'replace').encode('utf8', 'replace')
    +            if self.verbose:
    +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, path)
    +        return path
    +
         # output one file from the P4 stream
         # - helper for streamP4Files

         def streamOneP4File(self, file, contents):
             relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes)
    +        relPath = self.encodeWithUTF8(relPath)
             if verbose:
                 size = int(self.stream_file['fileSize'])
                 sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size/1024/1024))
    @@ -2511,6 +2515,7 @@ class P4Sync(Command, P4UserMap):

         def streamOneP4Deletion(self, file):
             relPath = self.stripRepoPath(file['path'], self.branchPrefixes)
    +        relPath = self.encodeWithUTF8(relPath)
             if verbose:
                 sys.stdout.write("delete %s\n" % relPath)
                 sys.stdout.flush()

 git-p4.py                       | 24 ++++++++++++++----------
 t/t9822-git-p4-path-encoding.sh | 16 ++++++++++++++++
 2 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index fd5ca52462..dac8b4955d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2418,11 +2418,24 @@ class P4Sync(Command, P4UserMap):
             self.gitStream.write(d)
         self.gitStream.write('\n')

+    def encodeWithUTF8(self, path):
+        try:
+            path.decode('ascii')
+        except:
+            encoding = 'utf8'
+            if gitConfig('git-p4.pathEncoding'):
+                encoding = gitConfig('git-p4.pathEncoding')
+            path = path.decode(encoding, 'replace').encode('utf8', 'replace')
+            if self.verbose:
+                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, path)
+        return path
+
     # output one file from the P4 stream
     # - helper for streamP4Files

     def streamOneP4File(self, file, contents):
         relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes)
+        relPath = self.encodeWithUTF8(relPath)
         if verbose:
             size = int(self.stream_file['fileSize'])
             sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size/1024/1024))
@@ -2495,16 +2508,6 @@ class P4Sync(Command, P4UserMap):
             text = regexp.sub(r'$\1$', text)
             contents = [ text ]

-        try:
-            relPath.decode('ascii')
-        except:
-            encoding = 'utf8'
-            if gitConfig('git-p4.pathEncoding'):
-                encoding = gitConfig('git-p4.pathEncoding')
-            relPath = relPath.decode(encoding, 'replace').encode('utf8', 'replace')
-            if self.verbose:
-                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
-
         if self.largeFileSystem:
             (git_mode, contents) = self.largeFileSystem.processContent(git_mode, relPath, contents)

@@ -2512,6 +2515,7 @@ class P4Sync(Command, P4UserMap):

     def streamOneP4Deletion(self, file):
         relPath = self.stripRepoPath(file['path'], self.branchPrefixes)
+        relPath = self.encodeWithUTF8(relPath)
         if verbose:
             sys.stdout.write("delete %s\n" % relPath)
             sys.stdout.flush()
diff --git a/t/t9822-git-p4-path-encoding.sh b/t/t9822-git-p4-path-encoding.sh
index 7b83e696a9..c78477c19b 100755
--- a/t/t9822-git-p4-path-encoding.sh
+++ b/t/t9822-git-p4-path-encoding.sh
@@ -51,6 +51,22 @@ test_expect_success 'Clone repo containing iso8859-1 encoded paths with git-p4.p
 	)
 '

+test_expect_success 'Delete iso8859-1 encoded paths and clone' '
+	(
+		cd "$cli" &&
+		ISO8859="$(printf "$ISO8859_ESCAPED")" &&
+		p4 delete "$ISO8859" &&
+		p4 submit -d "remove file"
+	) &&
+	git p4 clone --destination="$git" //depot@all &&
+	test_when_finished cleanup_git &&
+	(
+		cd "$git" &&
+		git -c core.quotepath=false ls-files >actual &&
+		test_must_be_empty actual
+	)
+'
+
 test_expect_success 'kill p4d' '
 	kill_p4d
 '

base-commit: 454cb6bd52a4de614a3633e4f547af03d5c3b640
--
2.11.0



