On Thu, Jul 10, 2008 at 11:54 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > "Geoffrey Irving" <irving@xxxxxxx> writes: > >>>> Oops: avoiding the infinite loop only requires reading expected O(1) >>>> entries on load, so I can fix that if you like. It would only be all of >>>> them if it actually did detect the infinite loop. >>> >>> I have to admit that you lost me there. AFAIR the patch-id cache is a >>> simple commit->patch_id store, right? Then there should be no way to get >>> an infinite loop. >> >> If every entry is nonnull, find_helper loops forever. > > Isn't it sufficient to make this part check the condition as well? > > + if (cache->count >= cache->size) > + { > + warning("%s is corrupt: count %"PRIu32" >= size %"PRIu32, > + filename, cache->count, cache->size); > + goto empty; > + } > > At runtime you keep the invariants that hashtable always has at most 3/4 > full and whoever wrote the file you are reading must have honored that as > well, or there is something fishy going on. Good point. There's no reason not to check the 3/4 condition. It isn't sufficient to avoid the infinite loop, though, since we don't verify that count is accurate. Another route would to eliminate the count field entirely, and replace the count >= size/4*3 check with a statistical one based on the entries seen so far. The main advantage of that would be to make incremental writes simpler by avoiding the need to update the header. The disadvantage is that there would be a small chance that the map would grow in size despite being almost empty. Thoughts on whether we should do that? >>> Besides, this is a purely local cache, no? Never to be transmitted... So >>> not much chance of a malicious attack, except if you allow write access to >>> your local repository, in which case you are endangered no matter what. >> >> Yep, that's why it's only a hole in quotes, and why I didn't fix it. > > You might want to protect yourself against file corruption, though. > Checksumming the whole file and checking it at opening defeats the point > of mmap'ed access, but at least the header may want to be checksummed? Okay. I imagine I should use sha1 as the sum? Geoffrey -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html