Re: [PATCH v3 3/9] finalize_object_file(): implement collision check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 16, 2024 at 11:54:14AM -0400, Taylor Blau wrote:
> On Mon, Sep 16, 2024 at 12:45:38PM +0200, Patrick Steinhardt wrote:
> > This function compares the exact contents, but isn't that wrong? The
> > contents may differ even though the object is the same because the
> > object hash is derived from the uncompressed data, whereas we store
> > compressed data on disk.
> >
> > So this might trigger when you have different zlib versions, but also if
> > you configure core.compression differently.
>
> Oh, shoot -- you're right for when we call finalize_object_file() on
> loose objects.
>
> For packfiles and other spots where we use the hashfile API, the
> name of the file depends on the checksum'd contents, so we're safe
> there. But for loose objects the, the name of the file is based on the
> object hash, which is the hash of the uncompressed contents.
>
> So I think we would need something like this on top:

Thinking about this a little more, I think that most cases should
actually be OK. Of course, this only affects repositories that are
changing their zlib version, and/or changing their core.compression
setting regularly, so this is all pretty niche, but still...

We often guard calls to write_loose_object() with either calling
freshen_loose_object(), or freshen_packed_object(), which will update
the mtime of the containing pack or loose object path for a given hash.

So if we already have a loose object with hash X in the object store,
and we try to write that same object again with a different zlib
version and/or core.compression setting, we'll simply call
freshen_loose_object() and optimize out the actual object write.

Of course, that's not always the case. We might e.g., force an object
loose via loosen_packed_objects() which could encounter a TOCTOU race
with an incoming object write using a different compression setting. But
that seems exceedingly rare to me.

I think that we should still take that change in the mail I'm replying
to, but I would definitely appreciate others' thoughts here.

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux