From: Andrzej Hunt <ajrhunt@xxxxxxxxxx>

ibuf can be reused for multiple iterations of the loop. Specifically:
deflate() overwrites s.avail_in to show how much of the input buffer
has not been processed yet - and sometimes leaves 'avail_in > 0', in
which case ibuf will be processed again during the loop's subsequent
iteration.

But if we declare ibuf within the loop, then (in theory) we get a new
(and uninitialised) buffer for every iteration. In practice, my compiler
seems to reuse the same buffer - meaning that this code does work - but
it doesn't seem safe to rely on this behaviour. MSAN correctly catches
this issue - as soon as we hit the 's.avail_in > 0' condition, we end up
reading from what seems to be uninitialised memory.

Therefore, we move ibuf out of the loop, making this reuse safe.

See MSAN output from t1050-large below - the interesting part is the
ibuf creation at the end, although there's a lot of indirection before
we reach the read from uninitialised memory:

==11294==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7f75db58fb1c in crc32_little crc32.c:283:9
    #1 0x7f75db58d5b3 in crc32_z crc32.c:220:20
    #2 0x7f75db59668c in crc32 crc32.c:242:12
    #3 0x8c94f8 in hashwrite csum-file.c:101:15
    #4 0x825faf in stream_to_pack bulk-checkin.c:154:5
    #5 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #6 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #7 0xa7cff2 in index_stream object-file.c:2234:9
    #8 0xa7bff7 in index_fd object-file.c:2256:9
    #9 0xa7d22d in index_path object-file.c:2274:7
    #10 0xb3c8c9 in add_to_index read-cache.c:802:7
    #11 0xb3e039 in add_file_to_index read-cache.c:835:9
    #12 0x4a99c3 in add_files add.c:458:7
    #13 0x4a7276 in cmd_add add.c:670:18
    #14 0x4a1e76 in run_builtin git.c:461:11
    #15 0x49e1e7 in handle_builtin git.c:714:3
    #16 0x4a0c08 in run_argv git.c:781:4
    #17 0x49d5a8 in cmd_main git.c:912:19
    #18 0x7974da in main common-main.c:52:11
    #19 0x7f75da66f349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #20 0x421bd9 in _start start.S:120

  Uninitialized value was stored to memory at
    #0 0x7f75db58fa6b in crc32_little crc32.c:283:9
    #1 0x7f75db58d5b3 in crc32_z crc32.c:220:20
    #2 0x7f75db59668c in crc32 crc32.c:242:12
    #3 0x8c94f8 in hashwrite csum-file.c:101:15
    #4 0x825faf in stream_to_pack bulk-checkin.c:154:5
    #5 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #6 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #7 0xa7cff2 in index_stream object-file.c:2234:9
    #8 0xa7bff7 in index_fd object-file.c:2256:9
    #9 0xa7d22d in index_path object-file.c:2274:7
    #10 0xb3c8c9 in add_to_index read-cache.c:802:7
    #11 0xb3e039 in add_file_to_index read-cache.c:835:9
    #12 0x4a99c3 in add_files add.c:458:7
    #13 0x4a7276 in cmd_add add.c:670:18
    #14 0x4a1e76 in run_builtin git.c:461:11
    #15 0x49e1e7 in handle_builtin git.c:714:3
    #16 0x4a0c08 in run_argv git.c:781:4
    #17 0x49d5a8 in cmd_main git.c:912:19
    #18 0x7974da in main common-main.c:52:11
    #19 0x7f75da66f349 in __libc_start_main (/lib64/libc.so.6+0x24349)

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db5c2011 in flush_pending deflate.c:746:5
    #2 0x7f75db5cafa0 in deflate_stored deflate.c:1815:9
    #3 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #4 0xd80b7f in git_deflate zlib.c:244:12
    #5 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #6 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #7 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #8 0xa7cff2 in index_stream object-file.c:2234:9
    #9 0xa7bff7 in index_fd object-file.c:2256:9
    #10 0xa7d22d in index_path object-file.c:2274:7
    #11 0xb3c8c9 in add_to_index read-cache.c:802:7
    #12 0xb3e039 in add_file_to_index read-cache.c:835:9
    #13 0x4a99c3 in add_files add.c:458:7
    #14 0x4a7276 in cmd_add add.c:670:18
    #15 0x4a1e76 in run_builtin git.c:461:11
    #16 0x49e1e7 in handle_builtin git.c:714:3
    #17 0x4a0c08 in run_argv git.c:781:4
    #18 0x49d5a8 in cmd_main git.c:912:19
    #19 0x7974da in main common-main.c:52:11

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db644241 in _tr_stored_block trees.c:873:5
    #2 0x7f75db5cad7c in deflate_stored deflate.c:1813:9
    #3 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #4 0xd80b7f in git_deflate zlib.c:244:12
    #5 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #6 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #7 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #8 0xa7cff2 in index_stream object-file.c:2234:9
    #9 0xa7bff7 in index_fd object-file.c:2256:9
    #10 0xa7d22d in index_path object-file.c:2274:7
    #11 0xb3c8c9 in add_to_index read-cache.c:802:7
    #12 0xb3e039 in add_file_to_index read-cache.c:835:9
    #13 0x4a99c3 in add_files add.c:458:7
    #14 0x4a7276 in cmd_add add.c:670:18
    #15 0x4a1e76 in run_builtin git.c:461:11
    #16 0x49e1e7 in handle_builtin git.c:714:3
    #17 0x4a0c08 in run_argv git.c:781:4
    #18 0x49d5a8 in cmd_main git.c:912:19
    #19 0x7974da in main common-main.c:52:11

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db5c8fcf in deflate_stored deflate.c:1783:9
    #2 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #3 0xd80b7f in git_deflate zlib.c:244:12
    #4 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #5 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #6 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #7 0xa7cff2 in index_stream object-file.c:2234:9
    #8 0xa7bff7 in index_fd object-file.c:2256:9
    #9 0xa7d22d in index_path object-file.c:2274:7
    #10 0xb3c8c9 in add_to_index read-cache.c:802:7
    #11 0xb3e039 in add_file_to_index read-cache.c:835:9
    #12 0x4a99c3 in add_files add.c:458:7
    #13 0x4a7276 in cmd_add add.c:670:18
    #14 0x4a1e76 in run_builtin git.c:461:11
    #15 0x49e1e7 in handle_builtin git.c:714:3
    #16 0x4a0c08 in run_argv git.c:781:4
    #17 0x49d5a8 in cmd_main git.c:912:19
    #18 0x7974da in main common-main.c:52:11
    #19 0x7f75da66f349 in __libc_start_main (/lib64/libc.so.6+0x24349)

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db5ea545 in read_buf deflate.c:1181:5
    #2 0x7f75db5c97f7 in deflate_stored deflate.c:1791:9
    #3 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #4 0xd80b7f in git_deflate zlib.c:244:12
    #5 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #6 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #7 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #8 0xa7cff2 in index_stream object-file.c:2234:9
    #9 0xa7bff7 in index_fd object-file.c:2256:9
    #10 0xa7d22d in index_path object-file.c:2274:7
    #11 0xb3c8c9 in add_to_index read-cache.c:802:7
    #12 0xb3e039 in add_file_to_index read-cache.c:835:9
    #13 0x4a99c3 in add_files add.c:458:7
    #14 0x4a7276 in cmd_add add.c:670:18
    #15 0x4a1e76 in run_builtin git.c:461:11
    #16 0x49e1e7 in handle_builtin git.c:714:3
    #17 0x4a0c08 in run_argv git.c:781:4
    #18 0x49d5a8 in cmd_main git.c:912:19
    #19 0x7974da in main common-main.c:52:11

  Uninitialized value was created by an allocation of 'ibuf' in the stack frame of function 'stream_to_pack'
    #0 0x825710 in stream_to_pack bulk-checkin.c:101

SUMMARY: MemorySanitizer: use-of-uninitialized-value crc32.c:283:9 in crc32_little
Exiting

Signed-off-by: Andrzej Hunt <andrzej@xxxxxxxxx>
---
 bulk-checkin.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 127312acd1ed..b023d9959aae 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -100,6 +100,7 @@ static int stream_to_pack(struct bulk_checkin_state *state,
 			  const char *path, unsigned flags)
 {
 	git_zstream s;
+	unsigned char ibuf[16384];
 	unsigned char obuf[16384];
 	unsigned hdrlen;
 	int status = Z_OK;
@@ -113,8 +114,6 @@ static int stream_to_pack(struct bulk_checkin_state *state,
 	s.avail_out = sizeof(obuf) - hdrlen;
 
 	while (status != Z_STREAM_END) {
-		unsigned char ibuf[16384];
-
 		if (size && !s.avail_in) {
 			ssize_t rsize = size < sizeof(ibuf) ? size : sizeof(ibuf);
 			ssize_t read_result = read_in_full(fd, ibuf, rsize);
-- 
gitgitgadget
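
For readers who want to see the pattern in isolation, here is a minimal
sketch of the refill-only-when-drained loop described in the commit
message. It is not the git code: toy_stream, toy_consume() and
sum_in_chunks() are hypothetical names invented for this illustration,
and toy_consume() merely imitates the way deflate() may return with
avail_in > 0. The point is that ibuf is declared outside the loop, so
the bytes left pending by one iteration are still valid in the next.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the input side of a zlib-style stream. */
struct toy_stream {
	const unsigned char *next_in;
	size_t avail_in;
};

/*
 * Hypothetical consumer: sums at most `limit` bytes per call and leaves
 * the remainder pending in next_in/avail_in, imitating a deflate() call
 * that returns with avail_in > 0.
 */
static unsigned long toy_consume(struct toy_stream *s, size_t limit)
{
	size_t n = s->avail_in < limit ? s->avail_in : limit;
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += s->next_in[i];
	s->next_in += n;
	s->avail_in -= n;
	return sum;
}

/*
 * Sum `len` bytes from `src`, copying them through a small buffer that
 * is refilled only once the previous contents are fully consumed.
 */
static unsigned long sum_in_chunks(const unsigned char *src, size_t len,
				   size_t limit)
{
	unsigned char ibuf[8];	/* declared outside the loop on purpose */
	struct toy_stream s = { NULL, 0 };
	unsigned long total = 0;

	assert(limit > 0);	/* a zero limit would never make progress */

	while (len || s.avail_in) {
		if (len && !s.avail_in) {
			size_t n = len < sizeof(ibuf) ? len : sizeof(ibuf);

			memcpy(ibuf, src, n);
			src += n;
			len -= n;
			s.next_in = ibuf;
			s.avail_in = n;
		}
		/* May leave avail_in > 0; ibuf must survive until then. */
		total += toy_consume(&s, limit);
	}
	return total;
}
```

With a limit smaller than the buffer, s.next_in keeps pointing into ibuf
across several iterations. If ibuf were declared inside the loop body,
each iteration would formally get a fresh object and the leftover bytes
would be indeterminate, which is the condition MSAN reports above.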