Re: [PATCH 02/10] builtin/fast-import: fix segfault with unsafe SHA1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 30, 2024 at 03:24:02PM +0100, Patrick Steinhardt wrote:
> Same as with the preceding commit, git-fast-import(1) is using the safe
> variant to initialize a hashfile checkpoint. This leads to a segfault
> when passing the checkpoint into the hashfile subsystem because it would
> use the unsafe variants instead:
>
>     ++ git --git-dir=R/.git fast-import --big-file-threshold=1
>     AddressSanitizer:DEADLYSIGNAL
>     =================================================================
>     ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0)
>     ==577126==The signal is caused by a READ memory access.
>     ==577126==Hint: address points to the zero page.
>         #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
>         #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
>         #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
>         #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
>         #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2
>         #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3
>         #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5
>         #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4
>         #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4
>         #9 0x555555b1f493 in run_builtin ../git.c:480:11
>         #10 0x555555b1bfef in handle_builtin ../git.c:740:9
>         #11 0x555555b1e6f4 in run_argv ../git.c:807:4
>         #12 0x555555b1b87a in cmd_main ../git.c:947:19
>         #13 0x5555561649e6 in main ../common-main.c:64:11
>         #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
>         #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
>         #16 0x555555772c84 in _start (git+0x21ec84)

Wow. This is a really good catch, and I'm embarrassed to have introduced
the bug in the first place.

If I can state back to you what I think is going on to make sure that
we're on the same page... In fast-import, we initialize a
hashfile_checkpoint struct using regular the_hash_algo callbacks, like
the_hash_algo->init_fn(), the_hash_algo->update_fn(), etc. But we also
have a hashfile, which will use the unsafe variants of these functions.

If we're using a specialized unsafe variant, then calling
hashfile_checkpoint() will crash. In your example, the crash happens
when we try and call the clone_fn, but I think it could also happen
with hashflush() when it tries to call update_fn().

I thought that my series here:

    https://lore.kernel.org/git/cover.1732130001.git.me@xxxxxxxxxxxx/

would have prevented this, but I don't think that it would have in its
current state. A little more on that in a second...

> ---
>  builtin/fast-import.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
>  		|| (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
>  		cycle_packfile();
>
> -	the_hash_algo->init_fn(&checkpoint.ctx);
> +	the_hash_algo->unsafe_init_fn(&checkpoint.ctx);

This will obviously fix the issue at hand, but I don't think this is any
less brittle than before. The hash function implementation here needs to
agree with that used in the hashfile API. This change makes that
happen, but only using side information that the hashfile API uses the
unsafe variants.

The series I mentioned above made the algop a property of the hashfile
itself, and introduced a specialized "unsafe" variant that did not
require using separate callbacks.

So that on its own would not have saved us here (or in the other spot
from the previous patch). But something on top like this would have:

--- 8< ---
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 76d5c20f14..89fc20d436 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1101,7 +1101,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
 		|| (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
 		cycle_packfile();

-	the_hash_algo->init_fn(&checkpoint.ctx);
+	pack_file->algop->init_fn(&checkpoint.ctx);
 	hashfile_checkpoint(pack_file, &checkpoint);
 	offset = checkpoint.offset;
--- >8 ---

(and an analogous change to the other spot).

I think we should perhaps combine forces here. My ideal end-state is to
have the unsafe_hash_algo() stuff land from my earlier series, then have
these two fixes (adjusted to the new world order as above), and finally
the Meson fixes after that.

Does that seem like a plan to you? If so, I can put everything together
and send it out (if you're OK with me forging your s-o-b).

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux