On Aug 19, 2013, at 10:16, Junio C Hamano wrote:
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
So why isn't the patch much more straightforward? Like the attached
totally untested one that just limits the read/write size to 8MB
(which is totally arbitrary, but small enough to not have any latency
issues even on slow disks, and big enough that any reasonable IO
subsystem will still get good throughput).
Ahh. OK, not noticing EINVAL unconditionally, but always feed IOs
in chunks that are big enough for sane systems but small enough for
broken ones.
That makes sense. Could somebody on MacOS X test this?
I tested this on both i386 (OS X 32-bit intel) and x86_64 (OS X 64-bit
intel).
What I tested:
1. I started with branch pu:
(965adb10 Merge branch 'sg/bash-prompt-lf-in-cwd-test' into pu)
2. I added Steffen's additional test (modified to always run) to t0021:
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e50f0f7..b92e6cb 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -190,4 +190,16 @@ test_expect_success 'required filter clean
failure' '
test_must_fail git add test.fc
'
+test_expect_success 'filter large file' '
+ git config filter.largefile.smudge cat &&
+ git config filter.largefile.clean cat &&
+ for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
+ echo "2GB filter=largefile" >.gitattributes &&
+ git add 2GB 2>err &&
+ ! test -s err &&
+ rm -f 2GB &&
+ git checkout -- 2GB 2>err &&
+ ! test -s err
+'
+
test_done
3. I verified that the test fails with an unpatched build on both 32-
bit and 64-bit.
4. I applied Linus's unmodified patch to wrapper.c.
5. I tested again. The t0021 test now passes on 64-bit. It still
fails on 32-bit for another reason unrelated to Linus's patch.
It fails when attempting the "git add 2GB" line from the new 'filter
large file' part of the test. The failure with backtrace:
git(16806,0xa095c720) malloc: *** mmap(size=2147487744) failed (error
code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
# NOTE: error code 12 is ENOMEM on OS X
Breakpoint 1, 0x97f634b1 in malloc_error_break ()
(gdb) bt
#0 0x97f634b1 in malloc_error_break ()
#1 0x97f5e49f in szone_error ()
#2 0x97e8b876 in allocate_pages ()
#3 0x97e8c062 in large_and_huge_malloc ()
#4 0x97e831c8 in szone_malloc ()
#5 0x97e82fb8 in malloc_zone_malloc ()
#6 0x97e8c7b2 in realloc ()
#7 0x00128abe in xrealloc (ptr=0x0, size=2147483649) at wrapper.c:100
#8 0x00111a8c in strbuf_grow (sb=0xbfffe634, extra=2147483648) at
strbuf.c:74
#9 0x00112bb9 in strbuf_read (sb=0xbfffe634, fd=6, hint=2548572518)
at strbuf.c:349
#10 0x0009b899 in apply_filter (path=<value temporarily unavailable,
due to optimizations>, src=0x1000000 ' ' <repeats 200 times>...,
len=2147483648, dst=0xbfffe774, cmd=0x402980 "cat") at convert.c:407
#11 0x0009c6f6 in convert_to_git (path=0x4028b4 "2GB", src=0x1000000 '
' <repeats 200 times>..., len=2147483648, dst=0xbfffe774,
checksafe=SAFE_CRLF_WARN) at convert.c:764
#12 0x0010bb38 in index_mem (sha1=0x402330 "", buf=0x1000000,
size=2147483648, type=OBJ_BLOB, path=0x4028b4 "2GB", flags=1) at
sha1_file.c:3044
#13 0x0010bf57 in index_core [inlined] () at /private/var/tmp/src/git/
sha1_file.c:3101
#14 0x0010bf57 in index_fd (sha1=0x402330 "", fd=5, st=0xbfffe900,
type=OBJ_BLOB, path=0x4028b4 "2GB", flags=1) at sha1_file.c:3139
#15 0x0010c05e in index_path (sha1=0x402330 "", path=0x4028b4 "2GB",
st=0xbfffe900, flags=1) at sha1_file.c:3157
#16 0x000e82f4 in add_to_index (istate=0x1a8820, path=0x4028b4 "2GB",
st=0xbfffe900, flags=0) at read-cache.c:665
#17 0x000e87c8 in add_file_to_index (istate=0x1a8820, path=0x4028b4
"2GB", flags=0) at read-cache.c:694
#18 0x0000440a in cmd_add (argc=<value temporarily unavailable, due to
optimizations>, argv=0xbffff584, prefix=0x0) at builtin/add.c:299
#19 0x00002e1f in run_builtin [inlined] () at /private/var/tmp/src/git/
git.c:303
#20 0x00002e1f in handle_internal_command (argc=2, argv=0xbffff584) at
git.c:466
#21 0x000032d4 in run_argv [inlined] () at /private/var/tmp/src/git/
git.c:512
#22 0x000032d4 in main (argc=2, av=0xbfffe28c) at git.c:595
The size 2147487744 is 2GB + 4096 bytes. Apparently git does not
support a filter for a file unless the file can fit entirely into
git's memory space. Normally a single 2GB + 4096 byte allocation
works in an OS X 32-bit process, but something else is apparently
eating up a large portion of the memory space in this case (perhaps an
mmap'd copy?). In any case, if the file being filtered was closer to
4GB in size it would always fail on 32-bit regardless.
The fact that the entire file is read into memory when applying the
filter does not seem like a good thing (see #7-#10 above).
--Kyle
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html