Re: [PATCH] sparse-index: copy dir_hash in ensure_full_index()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/23/2021 2:18 PM, Jeff Hostetler wrote:
> On 8/23/21 10:25 AM, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Mon, Aug 16 2021, Derrick Stolee via GitGitGadget wrote:
>>
>>> From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
>>>
>>> Copy the 'index_state->dir_hash' back to the real istate after expanding
>>> a sparse index.
>>>
>>> A crash was observed in 'git status' during some hashmap lookups with
>>> corrupted hashmap entries.  During an index expansion, new cache-entries
>>> are added to the 'index_state->name_hash' and the 'dir_hash' in a
>>> temporary 'index_state' variable 'full'.  However, only the 'name_hash'
>>> hashmap from this temp variable was copied back into the real 'istate'
>>> variable.  The original copy of the 'dir_hash' was incorrectly
>>> preserved.  If the table in the 'full->dir_hash' hashmap were realloced,
>>> the stale version (in 'istate') would be corrupted.
>>>
>>> The test suite does not operate on index sizes sufficiently large to
>>> trigger this reallocation, so they do not cover this behavior.
>>> Increasing the test suite to cover such scale is fragile and likely
>>> wasteful.
>>
>> How large does the index need to be to trigger this? I don't know if a
>> test here is useful, but FWIW if we had such a test then the EXPENSIVE
>> prereq + GIT_TEST_LONG=true might be a good fit for it.
>>
> 
> There would need to be enough directories in the repo to cause
> the dir_hash to grow during an insert into the dir_hash.
> IIRC the hashmap table is initialized to 64 and auto reallocs when
> it hits 80% capacity, so somewhere in the area of 50 directories
> at minimum.

Further complicating this is that the sparse index size matters
relative to the full index size, since the dir_hash has a given
size going into ensure_full_index() that depends on the sparse-index
size and the reallocation patterns.
 
> Whether the error is observed would also depend on free() trashing
> the contents of the memory and/or whether the memory was recycled
> by something else.

Issues like this cause this to be hard to reproduce. We did not
catch it in the performance test p2000-sparse-operations.sh, which
aims to measure how the index works at scale.

This makes me incredibly pessimistic that such a test will be
effective at preventing a regression, let alone catching similar
issues in the future.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux