On 6/17/07, Simon Hausmann <simon@xxxxxx> wrote:
On Tuesday 12 June 2007 03:13:17 Han-Wen Nienhuys wrote:
> Benjamin Sergeant escreveu:
> > A perforce command with all the files in the repo is generated to get
> > all the file content.
> > Here is a patch to break it into multiple successive perforce commands
> > that use 4K of parameters max, and collect the output for later.
> >
> > It works, but not for big depots, because the whole perforce depot
> > content is stored in memory in P4Sync.run(), and it looks like mine is
> > bigger than 2 gigs, so I had to kill the process.
>
> The general idea of the patch is OK. Some nits:
>
> > +        chunk = ''
> > +        filedata = []
> > +        for i in xrange(len(files)):
>
> why not
>
>     for f in files:
>
> ?

It seems 'i' is used a bit later. Is there a nicer way to express this in
Python?

> > +            f = files[i]
> > +            chunk += '"%s#%s" ' % (f['path'], f['rev'])
> > +            if len(chunk) > 4000 or i == len(files)-1:
>
> 4K seems reasonable enough, but can you take the min() with
> os.sysconf('SC_ARG_MAX')?
>
> Can you address this and resend so we can apply the patch?
> Thanks.

Since I ran into the very problem of a too-long command line myself
yesterday, I took the liberty of adding the SC_ARG_MAX bit to Benjamin's
patch and committing it then.
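The chunking idea discussed above can be sketched roughly as follows. This is
not the actual git-p4 patch; it is a minimal illustration (in modern Python,
so `range` rather than `xrange`) of splitting a list of "path#rev" specs into
command-line chunks capped at min(4K, SC_ARG_MAX), while also avoiding the
index-based loop that drew the review nit:

```python
import os

def chunked_specs(files, limit=None):
    """Group '"path#rev"' specs into strings no longer than `limit`,
    so each group fits on one p4 command line. Illustrative only."""
    if limit is None:
        # Cap at 4K, but never exceed what the OS allows for argv.
        try:
            limit = min(4000, os.sysconf('SC_ARG_MAX'))
        except (ValueError, OSError):
            limit = 4000
    chunks, chunk, size = [], [], 0
    for f in files:
        spec = '"%s#%s"' % (f['path'], f['rev'])
        # Flush the current chunk before it would overflow the limit.
        if chunk and size + len(spec) + 1 > limit:
            chunks.append(' '.join(chunk))
            chunk, size = [], 0
        chunk.append(spec)
        size += len(spec) + 1  # +1 for the joining space
    if chunk:
        chunks.append(' '.join(chunk))
    return chunks
```

Each returned string would then be passed to a separate p4 invocation, and the
outputs collected afterwards, as the patch description says.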
Cool. (Probably useless, but) for what it's worth, here is a tar file with
two patches:

- The original one
- A second one that adds the SC_ARG_MAX bit

BTW, doing a += on a string is not supposed to be fast; appending the
elements to a list and then using ' '.join on them to build the big string
is said to be faster (I did not time it, though). That change is in the
second patch as well.

Thanks,
Benjamin.
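The += claim above is easy to check with a small, self-contained comparison
(illustrative data, not from the patch). Note that CPython does optimize some
in-place string concatenations, so the gap is workload-dependent; the two
approaches do produce identical output:

```python
import timeit

# Build the same big argument string two ways: repeated += on a string
# versus collecting pieces in a list and joining once at the end.
pieces = ['"//depot/file%d#1"' % i for i in range(1000)]

def build_concat():
    s = ''
    for p in pieces:
        s += p + ' '
    return s

def build_join():
    return ' '.join(pieces) + ' '

if __name__ == '__main__':
    # Time both; which wins (and by how much) depends on the
    # interpreter and data sizes, so measure rather than assume.
    print('concat:', timeit.timeit(build_concat, number=200))
    print('join:  ', timeit.timeit(build_join, number=200))
```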
Simon
Attachment:
p4-sync-chunks.tar
Description: Unix tar archive