Thanks for all the input and patience. On Fri, Feb 22, 2013 at 10:34 AM, Jeff King <peff@xxxxxxxx> wrote: > On Fri, Feb 22, 2013 at 09:30:57AM -0800, Joshua Clayton wrote: > >> Read and write each 1024 byte buffer, rather than trying to buffer >> the entire content of the file. > > OK. Did you ever repeat your timing with a larger symmetric buffer? That > should probably be a separate patch on top, but it might be worth doing > while we are thinking about it. > >> Previous code would crash on all files > 2 Gib, when the offset variable >> became negative (perhaps below the level of perl), resulting in a crash. > > I'm still slightly dubious of this, just because it doesn't match my > knowledge of perl (which is admittedly imperfect). I'm curious how you > diagnosed it? I first had the memory exhaustion problem running my git repo on a 32 vm. After bumping the memory from 512 to 4 GiB, and that failing to fix it I moved to my workstation with 16 GiB ...reproduced After the initial crash, I added print $size, " ", $bytesToRead, " ", $bytesRead, "\n"; right before the read command, and it does indeed crash right after the $bytesRead variable crosses LONG_MAX ... 2567089913 1024 2147482624 2567089913 1024 2147483648 2567089913 1024 2147484672 Offset outside string at /usr/share/perl5/Git.pm line 901, <GEN36> line 2604. Note that $bytesRead is still positive. I know very little perl, but that symptom seems pretty clear > >> On a 32 bit system, or a system with low memory it might crash before >> reaching 2 GiB due to memory exhaustion. >> >> Signed-off-by: Joshua Clayton <stillcompiling@xxxxxxxxx> >> Reviewed-by: Jeff King <peff@xxxxxxxx> > > The commit message is a good place to mention any side effects, and why > they are not a problem. Something like: > > The previous code buffered the whole blob before writing, so any error > reading from cat-file would result in zero bytes being written to the > output stream. After this change, the output may be left in a > partially written state (or even fully written, if we fail when > parsing the final newline from cat-file). However, it's not reasonable > for callers to expect anything about the state of the output when we > return an error (after all, even with full buffering, we might fail > during the writing process). So any caller which cares about this is > broken already, and we do not have to worry about them. > >> --- >> perl/Git.pm | 12 +++++------- >> 1 file changed, 5 insertions(+), 7 deletions(-) > > The patch itself looks fine to me. > > -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html