[PATCH] git-p4: improve performance with large files

   git-p4 currently builds file contents by repeatedly concatenating
strings, which performs in O(n^2) time because each concatenation
copies the entire accumulated buffer. This makes the import terribly
slow with large files. The following patch collects the chunks in a
list and joins them once at the end, making the operation O(n).

   Using this patch, importing a 17GB repository containing large files
(50 to 500MB each) takes 2 hours instead of a week.
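   For reference, the two patterns in isolation (a minimal sketch, not
part of the patch; the function names are illustrative):

    # Appending to an immutable Python str copies the accumulated
    # buffer on each iteration, so building an n-byte result can
    # cost O(n^2) in total.
    def concat_quadratic(chunks):
        text = ''
        for chunk in chunks:
            text += chunk        # copies len(text) bytes each time
        return text

    # Collecting the chunks in a list and joining once makes a single
    # pass over all the data, so the total cost is O(n).
    def concat_linear(chunks):
        data = []
        for chunk in chunks:
            data.append(chunk)   # amortized O(1) per append
        return "".join(data)     # one copy of the full result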

Signed-off-by: Sam Hocevar <sam@xxxxxxx>
---
 contrib/fast-import/git-p4 |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/contrib/fast-import/git-p4 b/contrib/fast-import/git-p4
index 9fdb0c6..09e9746 100755
--- a/contrib/fast-import/git-p4
+++ b/contrib/fast-import/git-p4
@@ -990,11 +990,12 @@ class P4Sync(Command):
         while j < len(filedata):
             stat = filedata[j]
             j += 1
-            text = ''
+            data = []
             while j < len(filedata) and filedata[j]['code'] in ('text', 'unicode', 'binary'):
-                text += filedata[j]['data']
+                data.append(filedata[j]['data'])
                 del filedata[j]['data']
                 j += 1
+            text = "".join(data)
 
             if not stat.has_key('depotFile'):
                 sys.stderr.write("p4 print fails with: %s\n" % repr(stat))
-- 
1.6.1.3