On Mon, Aug 19, 2024 at 12:49:52PM +0200, Patrick Steinhardt wrote: > On Sat, Aug 17, 2024 at 08:14:24AM -0400, Jeff King wrote: > > On Fri, Aug 16, 2024 at 12:45:17PM +0200, Patrick Steinhardt wrote: > > > > > Fix this bug by asking git-gc(1) to not detach when it is being invoked > > > via git-maintenance(1). Instead, git-maintenance(1) now respects a new > > > config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and > > > detaches itself into the background when running as part of our auto > > > maintenance. This should continue to behave the same for all users which > > > use the git-gc(1) task, only. For others though, it means that we now > > > properly perform all tasks in the background. The default behaviour of > > > git-maintenance(1) when executed by the user does not change, it will > > > remain in the foreground unless they pass the `--detach` option. > > > > This patch seems to cause segfaults in t5616 when combined with the > > reftable backend. Try this: > > > > GIT_TEST_DEFAULT_REF_FORMAT=reftable ./t5616-partial-clone.sh --run=1-16 --stress > > > > which fails for me within a few runs. Bisecting leads to 98077d06b2 > > (run-command: fix detaching when running auto maintenance, 2024-08-16). > > It doesn't trigger with the files ref backend. > > > > Compiling with ASan gets me a stack trace like this: > > > > + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin > > AddressSanitizer:DEADLYSIGNAL > > ================================================================= > > ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0) > > ==657994==The signal is caused by a READ memory access. > > #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29 > > #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170 > > #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194 > > #3 0x55f23e54e72e in block_iter_next reftable/block.c:398 > > #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240 > > #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355 > > #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339 > > #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69 > > #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123 > > #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172 > > #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175 > > #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464 > > #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13 > > #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452 > > #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623 > > #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659 > > #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133 > > #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432 > > #18 0x55f23dba7764 in run_builtin git.c:484 > > #19 0x55f23dba7764 in handle_builtin git.c:741 > > #20 0x55f23dbab61e in run_argv git.c:805 > > #21 0x55f23dbab61e in cmd_main git.c:1000 > > #22 0x55f23dba4781 in main common-main.c:64 > > #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 > > #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360 > > #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27) > > I haven't yet been able to definitely tell, but I think this is a > lifetime issue. We create an iterator, eventually notice that the > reftable stack has been rewritten, and reload the stack. But the > block sources used for the old tables are still referenced by the > iterator, even though it was closed. As such, the mmapped memory of the > table has been unmapped and is now invalid, which causes the above > invalid reads. > > I'll work on a patch series that introduces refcounting for block > sources, but guess that'll take a bit. This is being handled via https://lore.kernel.org/git/cover.1724080006.git.ps@xxxxxx/. Patrick