Re: [PATCH] cherry: cache patch-ids to avoid repeating work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 10, 2008 at 11:54 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> "Geoffrey Irving" <irving@xxxxxxx> writes:
>
>>>> Oops: avoiding the infinite loop only requires reading expected O(1)
>>>> entries on load, so I can fix that if you like.  It would only be all of
>>>> them if it actually did detect the infinite loop.
>>>
>>> I have to admit that you lost me there.  AFAIR the patch-id cache is a
>>> simple commit->patch_id store, right?  Then there should be no way to get
>>> an infinite loop.
>>
>> If every entry is nonnull, find_helper loops forever.
>
> Isn't it sufficient to make this part check the condition as well?
>
> +       if (cache->count >= cache->size)
> +       {
> +               warning("%s is corrupt: count %"PRIu32" >= size %"PRIu32,
> +                       filename, cache->count, cache->size);
> +               goto empty;
> +       }
>
> At runtime you keep the invariants that hashtable always has at most 3/4
> full and whoever wrote the file you are reading must have honored that as
> well, or there is something fishy going on.

Good point.  There's no reason not to check the 3/4 condition.  It
isn't sufficient to avoid the infinite loop, though, since we don't
verify that count is accurate.

Another route would to eliminate the count field entirely, and replace
the count >= size/4*3 check with a statistical one based on the
entries seen so far.  The main advantage of that would be to make
incremental writes simpler by avoiding the need to update the header.
The disadvantage is that there would be a small chance that the map
would grow in size despite being almost empty.  Thoughts on whether we
should do that?

>>> Besides, this is a purely local cache, no?  Never to be transmitted...  So
>>> not much chance of a malicious attack, except if you allow write access to
>>> your local repository, in which case you are endangered no matter what.
>>
>> Yep, that's why it's only a hole in quotes, and why I didn't fix it.
>
> You might want to protect yourself against file corruption, though.
> Checksumming the whole file and checking it at opening defeats the point
> of mmap'ed access, but at least the header may want to be checksummed?

Okay.  I imagine I should use sha1 as the sum?

Geoffrey
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux