Re: [PATCH] git-p4: improve performance with large files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Sam Hocevar <sam@xxxxxxx>:

   The current git-p4 way of concatenating strings performs in O(n^2)
and is therefore terribly slow with large files because of unnecessary
memory copies. The following patch makes the operation O(n).

The reason why it uses simple concatenation is to cut down on memory usage.
 - It is a tradeoff.

I think the modification you have made below is reasonable, however be aware that memory usage could double, which substantially reduce the size of the changesets that git-p4 would be able to import /at all/, rather than to merely be slow.

That said, you do need to delete the data temporary array to cut down on memory.
 -- I would do this immediately after the "".join(data).


   Using this patch, importing a 17GB repository with large files
(50 to 500MB) takes 2 hours instead of a week.

Signed-off-by: Sam Hocevar <sam@xxxxxxx>
---
 contrib/fast-import/git-p4 |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/contrib/fast-import/git-p4 b/contrib/fast-import/git-p4
index 9fdb0c6..09e9746 100755
--- a/contrib/fast-import/git-p4
+++ b/contrib/fast-import/git-p4
@@ -990,11 +990,12 @@ class P4Sync(Command):
         while j < len(filedata):
             stat = filedata[j]
             j += 1
-            text = ''
+            data = []
while j < len(filedata) and filedata[j]['code'] in ('text', 'unicode', 'binary'):
-                text += filedata[j]['data']
+                data.append(filedata[j]['data'])
                 del filedata[j]['data']
                 j += 1
+            text = "".join(data)

             if not stat.has_key('depotFile'):
                 sys.stderr.write("p4 print fails with: %s\n" % repr(stat))
--
1.6.1.3
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux