[RFC/PATCH 0/4] duplicate objects in packfiles

On Fri, Aug 16, 2013 at 11:01:38AM -0400, Jeff King wrote:

> That makes me inclined to teach index-pack to reject duplicate objects
> in a single pack in order to prevent denial-of-service attacks. We can
> potentially make them work in all code paths, but given that nobody
> should be doing this legitimately, rejecting the duplicates outright
> keeps our attack surface small, and nobody but attackers or users of
> broken implementations should care.

Here's a patch series in that direction:

  [1/4]: sha1-lookup: handle duplicate keys with GIT_USE_LOOKUP

This reproduces and fixes the sha1-lookup bug. We should do this no
matter what else we do.

  [2/4]: index-pack: optionally reject packs with duplicate objects

This adds a pack.indexDuplicates option so that sites receiving
packfiles from random folks on the internet can protect themselves from
the potential denial-of-service mentioned above. The default remains to
allow duplicates.
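
As a rough illustration of the kind of check this patch describes, the
sketch below finds object IDs that appear more than once in a pack's
entry list and either tolerates or rejects them. It is Python rather
than git's C, and find_duplicates, check_pack, and the "allow"/"reject"
policy strings are hypothetical names picked for the example; they are
not the actual index-pack code, nor necessarily the values that
pack.indexDuplicates accepts.

```python
from collections import Counter


def find_duplicates(object_ids):
    """Return the sorted list of object IDs that occur more than once."""
    counts = Counter(object_ids)
    return sorted(oid for oid, n in counts.items() if n > 1)


def check_pack(object_ids, policy="allow"):
    """Apply a hypothetical duplicate policy to a pack's object IDs.

    "allow" returns whatever duplicates were found; "reject" raises
    ValueError as soon as any duplicate exists.
    """
    if policy not in ("allow", "reject"):
        raise ValueError(f"unknown policy: {policy}")
    dups = find_duplicates(object_ids)
    if dups and policy == "reject":
        raise ValueError(f"pack has duplicate objects: {', '.join(dups)}")
    return dups
```

A receiving site would call something like check_pack(ids, "reject"),
while the permissive default corresponds to check_pack(ids, "allow").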

  [3/4]: reject duplicates when indexing a packfile we created

This is a safety check for packs we generate. Optional, but I think it's
probably a good idea (and doesn't cost very much).

  [4/4]: index-pack: optionally skip duplicate packfile entries

I really wanted to have a "fix" mode where we could take in packs with
duplicate entries and just use them as-is. It's not correct to throw
away the duplicates (they may be bases in a delta cycle), but we could
maybe get by with simply not referencing them in the pack index.
Unfortunately, the pack reader does not like the index we generate; see
the patch for details and possible solutions (all of which are ugly).
And it only helps in a delta cycle when delta base offsets are used.
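
To make the "skip" idea concrete, here is a minimal sketch of building
an index that references each object ID only once while leaving the
pack's entries untouched. It is Python, not git's C, and
build_index_skipping_duplicates is a hypothetical name; keeping the
first offset seen for each ID is an assumption made for the example.
It only shows the shape of the idea and does nothing about the pack
reader problem described above.

```python
def build_index_skipping_duplicates(entries):
    """Build a sorted (object_id, offset) index with one row per ID.

    `entries` is an iterable of (object_id, offset) pairs in pack order.
    Later duplicates stay in the pack but get no index row (assumption:
    the first occurrence is the one that is kept).
    """
    first_offset = {}
    for oid, offset in entries:
        # setdefault only stores the offset the first time an ID is seen
        first_offset.setdefault(oid, offset)
    return sorted(first_offset.items())
```

Because nothing is dropped from the pack itself, an entry that is left
out of the index can still be reached by its offset, which is why this
could only ever help when delta base offsets are used.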

I had hoped to have a 5/4 flipping the default to "skip", since it would
potentially fix the infinite loop problem and wouldn't have a downside.
But since it doesn't work (and cannot fix the REF_DELTA case), it seems
like a bad idea.

Which leaves the open question: should the default for index-pack flip
to reject duplicates rather than allow? It seems like it would be worth
it to identify buggy packfiles before they move between repos. And in an
emergency, we have the config flag to be more permissive in case
somebody really needs to move the data via git.

Thoughts?

-Peff



