Re: [GSoC] Designing a faster index format - Progress report week 13

On 07/16, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@xxxxxxxxx> writes:
> 
> > == Work done in the previous 12 weeks ==
> >
> > - Definition of a tentative index file v5 format [1]. This differs
> >   from the proposal in making it possible to bisect the directory
> >   entries and file entries, to do a binary search. The exact bits
> >   for each section were also defined. To further compress the index,
> >   along with prefix compression, the stat data is hashed, since
> >   it's only used for comparison, but the plain data is never used.
> 
> s/comparison/equality comparison/ perhaps?
> 

Exactly, thanks.
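
To illustrate why equality comparison is all that is needed, here is a
minimal sketch in Python (not the actual index code). The choice of
fields, the packing layout, the use of CRC-32 and the names
digest_fields and stat_digest are all assumptions made for the example;
the v5 draft defines its own exact bits.

```python
import os
import struct
import zlib


def digest_fields(ino, size, mtime_ns, mode):
    """Reduce a few stat fields to one 32-bit checksum.

    Field selection and layout are hypothetical; they only show that
    the stored value supports "same or different?" and nothing else.
    """
    mask = 0xFFFFFFFFFFFFFFFF  # keep every field within 64 bits
    packed = struct.pack(
        ">QQQQ", ino & mask, size & mask, mtime_ns & mask, mode & mask
    )
    return zlib.crc32(packed)


def stat_digest(path):
    """Compute the checksum for a file that exists on disk."""
    st = os.stat(path)
    return digest_fields(st.st_ino, st.st_size, st.st_mtime_ns, st.st_mode)


def is_unchanged(path, stored_digest):
    """True if the file's current stat data matches the stored checksum."""
    return stat_digest(path) == stored_digest
```

The original size or mtime cannot be read back from the checksum, which
is fine as long as no caller needs the plain values.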

> >   Thanks to Michael Haggerty, Nguyen Thai Ngoc Duy, Thomas Rast
> >   and Robin Rosenberg for feedback.
> 
> > - Read the index format format and translate it to the current in
> 
> s/format format/on-disk file format/ or something?
>

Yes, thanks.

> >   memory format. This doesn't include reading any of the current
> >   extensions, which are now part of the main index. The code again
> >   is on github. [4] Thanks to Thomas Rast for reviewing the first
> >   steps.
> 
> > - Started implementing the writer, which extracts the directories from
> >   the in-memory format, and writes the header and the directories to
> >   disk.
> > - I found a few bugs in the algorithm for extracting the directories
> >   and decided to completely rewrite it, using a hash table instead of
> >   simple lists, since the old one would have too many corner cases to
> >   handle.
> 
> What does "the algorithm" refer to?  Is it the one described in the
> previous bullet point, or is it the code in production?  If latter,
> it would help to separate out the task to fix the breakage, as
> people with the current or previous versions of Git will be
> negatively affected until that bug is fixed.  If former, I am not
> sure if this task needs to be described in two bullet points ("I did
> X, X had bug so I redid X in a different way" is still a single task
> to do X).

It refers to the algorithm in the previous bullet point, the one that
extracts the directories, so both items can be merged into a single
bullet point. Sorry for the confusion.
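
For readers who want a picture of the hash table approach, here is a
rough sketch in Python (not the C code in the branch). It assumes the
in-memory entries are plain '/'-separated paths, and the function name
extract_directories is made up for the example. It only counts the
files directly inside each directory.

```python
def extract_directories(paths):
    """Map each directory to the number of files directly inside it.

    The root directory is the empty string. A dict stands in for the
    hash table, so looking up a directory does not depend on how the
    paths are ordered.
    """
    dirs = {"": 0}
    for path in paths:
        parent = path.rpartition("/")[0]
        # Register the parent and any ancestors not seen yet, so that
        # directories holding only subdirectories still get an entry.
        prefix = parent
        while prefix and prefix not in dirs:
            dirs[prefix] = 0
            prefix = prefix.rpartition("/")[0]
        dirs[parent] += 1
    return dirs
```

For example, the paths "Makefile", "builtin/add.c", "t/lib/a.sh" and
"t/lib/b.sh" give entries for "", "builtin", "t" and "t/lib", where "t"
has a count of zero because it contains only a subdirectory.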

> > == Work done in the last week ==
> >
> > - Polished the patch for the ce_namelen field. The thread for the
> >   patch can be found at [5].
> 
> Thanks for this one; I think it is ready for 'next', but if you are
> still not satisfied I do not mind waiting for further perfection.

Thanks, I'm satisfied with it; as far as I'm concerned it can be merged
to 'next'.