Re: [PATCH] git-p4: improve performance with large files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 05, 2009, Han-Wen Nienhuys wrote:

> i'd say
> 
>   data = []
> 
> add a comment that you're trying to save memory. There is no reason to
> remove data from the namespace.

   Okay. Here is an improved version.

Signed-off-by: Sam Hocevar <sam@xxxxxxx>
---
 contrib/fast-import/git-p4 |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/contrib/fast-import/git-p4 b/contrib/fast-import/git-p4
index 3832f60..db0ea0a 100755
--- a/contrib/fast-import/git-p4
+++ b/contrib/fast-import/git-p4
@@ -984,11 +984,22 @@ class P4Sync(Command):
         while j < len(filedata):
             stat = filedata[j]
             j += 1
+            data = []
             text = ''
+            # Append data every 8192 chunks to 1) ensure decent performance
+            # by not making too many string concatenations and 2) avoid
+            # excessive memory usage by purging "data" often enough. p4
+            # sends 4k chunks, so we should not use more than 32 MiB of
+            # additional memory while rebuilding the file data.
             while j < len(filedata) and filedata[j]['code'] in ('text', 'unicod
e', 'binary'):
-                text += filedata[j]['data']
+                data.append(filedata[j]['data'])
                 del filedata[j]['data']
+                if len(data) > 8192:
+                    text += ''.join(data)
+                    data = []
                 j += 1
+            text += ''.join(data)
+            data = None

             if not stat.has_key('depotFile'):
                 sys.stderr.write("p4 print fails with: %s\n" % repr(stat))

-- 
Sam.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux