On Fri, Aug 16, 2013 at 11:01:38AM -0400, Jeff King wrote:

> That makes me inclined to teach index-pack to reject duplicate objects
> in a single pack in order to prevent denial-of-service attacks. We can
> potentially make them work in all code paths, but given that nobody
> should be doing this legitimately, rejecting the duplicates outright
> keeps our attack surface small, and nobody but attackers or users of
> broken implementations should care.

Here's a patch series in that direction:

  [1/4]: sha1-lookup: handle duplicate keys with GIT_USE_LOOKUP

    This reproduces and fixes the sha1-lookup bug. We should do this no
    matter what else we do.

  [2/4]: index-pack: optionally reject packs with duplicate objects

    This adds a pack.indexDuplicates option so that sites receiving
    packfiles from random folks on the internet can protect themselves
    from the potential denial-of-service mentioned above. The default
    remains to allow it.

  [3/4]: reject duplicates when indexing a packfile we created

    This is a safety check for packs we generate. Optional, but I think
    it's probably a good idea (and doesn't cost very much).

  [4/4]: index-pack: optionally skip duplicate packfile entries

    I really wanted to have a "fix" mode where we could take in packs
    with duplicate entries and just use them as-is. It's not correct to
    throw away the duplicates (they may be bases in a delta cycle), but
    we could maybe get by with simply not referencing them in the pack
    index.

    Unfortunately, the pack reader does not like the index we generate;
    see the patch for details and possible solutions (all of which are
    ugly). And it only helps in a delta cycle when delta base offsets
    are used.

I had hoped to have a 5/4 flipping the default to "skip", since it would
potentially fix the infinite loop problem and wouldn't have a downside.
But since it doesn't work (and cannot fix the REF_DELTA case), it seems
like a bad idea.
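To make the allow/reject distinction in 2/4 concrete, here is a rough sketch of the check in Python. It is not the code from the series: the entry shape, the function names, and the `allow_duplicates` flag are all made up for illustration, and the real option name and values are whatever the patch defines. The idea is only that once the entries are sorted by object id, any duplicates end up adjacent and are cheap to spot.

```python
from typing import Iterable, List, Tuple

# Hypothetical entry shape: (object id as hex string, offset in the pack).
Entry = Tuple[str, int]


def find_duplicate_ids(entries: Iterable[Entry]) -> List[str]:
    """Return each object id that occurs more than once, in sorted order."""
    ordered = sorted(entries)
    dups: List[str] = []
    # After sorting, equal ids sit next to each other, so one pass suffices.
    for (prev_id, _), (cur_id, _) in zip(ordered, ordered[1:]):
        if prev_id == cur_id and (not dups or dups[-1] != cur_id):
            dups.append(cur_id)
    return dups


def check_pack(entries: Iterable[Entry], allow_duplicates: bool = True) -> List[str]:
    """Report duplicate ids; raise if any exist and they are not allowed.

    `allow_duplicates` is an illustrative stand-in for the config knob,
    not its real name or type.
    """
    dups = find_duplicate_ids(entries)
    if dups and not allow_duplicates:
        raise ValueError("pack has duplicate objects: " + ", ".join(dups))
    return dups
```

With `allow_duplicates=True` the duplicates are merely reported, which mirrors the permissive default; with `False` the pack is refused before it is ever indexed.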

Which leaves the open question: should the default for index-pack flip
to reject duplicates rather than allow? It seems like it would be worth
it to identify buggy packfiles before they move between repos. And in an
emergency, we have the config flag to be more permissive in case
somebody really needs to move the data via git.

Thoughts?

-Peff