Re: [PATCH] Fix in Git.pm cat_blob crashes on large files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



running git svn fetch on a remote repository (yes I know there are a
lot of possible outside variables, including network latency)
Code with 1024 reads and 64k writes:

real    75m19.906s
user    16m43.919s
sys     29m16.326s

Code with 1024 reads and 1024 writes:

real    71m21.006s
user    12m36.275s
sys     24m26.112s

...so the simpler code wins the trivial test.
I would say go with it.
Should I resubmit?

On Thu, Feb 21, 2013 at 3:24 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Thu, Feb 21, 2013 at 03:18:40PM -0800, Joshua Clayton wrote:
>
>> > By having the read and flush size be the same, it's much simpler.
>>
>> My original bugfix did just read 1024, and write 1024. That works fine
>> and, yes, is simpler.
>> I changed it to be more similar to the original code in case there
>> were performance reasons for doing it that way.
>> That was the only reason I could think of for the design, and adding
>> the $flushSize variable means that
>> some motivated person could easily optimize it.
>>
>> So far I have been too lazy to profile the two versions....
>> I guess I'll try a trivial git svn init; git svn fetch and check back in.
>> Its running now.
>
> I doubt it will make much of a difference; we are already writing to a
> perl filehandle, so it will be buffered there (which I assume is 4K, but
> I haven't checked). And your version retains the 1024-byte read. I do
> think 1024 is quite low for this sort of descriptor-to-descriptor
> copying. I'd be tempted to just bump that 1024 to 64K.
>
>> In git svn fetch (which is how I discovered it) the file being passed
>> to cat_blob is a temporary file, which is checksummed before putting
>> it into place.
>
> Great. There may be other callers outside of our tree, of course, but I
> think it's pretty clear that the responsibility is on the caller to make
> sure the function succeeded. We are changing what gets put on the output
> stream for various error conditions, but ultimately that is an
> implementation detail that the caller should not be depending on.
>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]