Re: [PATCH 02/10] builtin/fast-import: fix segfault with unsafe SHA1

On Fri, Jan 03, 2025 at 02:08:01PM +0100, Patrick Steinhardt wrote:
> On Mon, Dec 30, 2024 at 12:22:34PM -0500, Taylor Blau wrote:
> > On Mon, Dec 30, 2024 at 03:24:02PM +0100, Patrick Steinhardt wrote:
> > > diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> > > index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
> > > --- a/builtin/fast-import.c
> > > +++ b/builtin/fast-import.c
> > > @@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
> > >  		|| (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
> > >  		cycle_packfile();
> > >
> > > -	the_hash_algo->init_fn(&checkpoint.ctx);
> > > +	the_hash_algo->unsafe_init_fn(&checkpoint.ctx);
> >
> > This will obviously fix the issue at hand, but I don't think this is any
> > less brittle than before. The hash function implementation here needs to
> > agree with that used in the hashfile API. This change makes that
> > happen, but only using side information that the hashfile API uses the
> > unsafe variants.
>
> Yup, I only cared about fixing the segfault because we're close to the
> v2.48 release. I agree that the overall state is still extremely brittle
> right now.
>
> [snip]
> > I think we should perhaps combine forces here. My ideal end-state is to
> > have the unsafe_hash_algo() stuff land from my earlier series, then have
> > these two fixes (adjusted to the new world order as above), and finally
> > the Meson fixes after that.
> >
> > Does that seem like a plan to you? If so, I can put everything together
> > and send it out (if you're OK with me forging your s-o-b).
>
> I think the ideal state would be if the hashing function used was stored
> as part of `struct git_hash_ctx`. So the flow basically becomes for
> example:
>
>     ```
>     struct git_hash_ctx ctx;
>     struct object_id oid;
>
>     git_hash_sha1_init(&ctx);
>     git_hash_update(&ctx, data);
>     git_hash_final_oid(&oid, &ctx);
>     ```
>
> Note how the intermediate calls don't need to know which hash function
> you used to initialize the `struct git_hash_ctx` -- the structure itself
> should remember what it has been initialized with and do the right thing.

I'm not sure I'm following you here. In the stream_blob() function
within fast-import, the problem isn't that we're switching hash
functions mid-stream, but that we're initializing the hashfile_checkpoint's
hash context with the wrong hash function to begin with.

You snipped it out of your reply, but I think that my suggestion to do:

    pack_file->algop->init_fn(&checkpoint.ctx);

would harden us against the broken behavior we're seeing here.
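
On top of the patch quoted above, that would amount to something like
the following sketch (it assumes the algop pointer from the earlier
unsafe_hash_algo() series is reachable from fast-import):

    -	the_hash_algo->unsafe_init_fn(&checkpoint.ctx);
    +	pack_file->algop->init_fn(&checkpoint.ctx);

That way stream_blob() picks up whatever algorithm the packfile's
hashfile was actually set up with instead of hard-coding it.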

As a separate defense-in-depth measure, we could teach the functions in
the hashfile API which deal with the hashfile_checkpoint structure to
ensure that the hashfile and its checkpoint both use the same algorithm
(by adding a hash_algo field to the hashfile_checkpoint structure).
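
In csum-file.{c,h} that could look roughly like the sketch below (the
field and message names are illustrative, and it again assumes the
hashfile carries an algop pointer describing its hash function):

    struct hashfile_checkpoint {
            const struct git_hash_algo *algop;
            off_t offset;
            git_hash_ctx ctx;
    };

    void hashfile_checkpoint(struct hashfile *f, struct hashfile_checkpoint *checkpoint)
    {
            hashflush(f);
            checkpoint->offset = f->total;
            /* Remember which algorithm this checkpoint was taken with. */
            checkpoint->algop = f->algop;
            f->algop->clone_fn(&checkpoint->ctx, &f->ctx);
    }

    int hashfile_truncate(struct hashfile *f, struct hashfile_checkpoint *checkpoint)
    {
            off_t offset = checkpoint->offset;

            /*
             * Catch callers that mix a hashfile with a checkpoint taken
             * from a hashfile using a different hash algorithm.
             */
            if (checkpoint->algop != f->algop)
                    BUG("hashfile checkpoint taken with a different algorithm");

            if (ftruncate(f->fd, offset) ||
                lseek(f->fd, offset, SEEK_SET) != offset)
                    return -1;
            f->total = offset;
            f->ctx = checkpoint->ctx;
            f->offset = 0;
            return 0;
    }

Failing loudly at the API boundary like this would turn the
hard-to-debug segfault into an immediate BUG() if a caller ever mixes
algorithms again.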

Thanks,
Taylor
