Re: [GSoC] Designing a faster index format - Progress report week 13

Thomas Gummerer <t.gummerer@xxxxxxxxx> writes:

> == Work done in the previous 12 weeks ==
>
> - Definition of a tentative index file v5 format [1]. This differs
>   from the proposal in making it possible to bisect the directory
>   entries and file entries, to do a binary search. The exact bits
>   for each section were also defined. To further compress the index,
>   along with prefix compression, the stat data is hashed, since
>   it's only used for comparison, but the plain data is never used.

s/comparison/equality comparison/ perhaps?
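
To illustrate why the distinction matters: once the stat fields are
folded into a digest, the only question the index can still answer is
"is it the same as before?", never "is it newer?". A minimal sketch,
assuming CRC-32 as the hash and an arbitrary selection of fields (both
are my assumptions for illustration; the v5 draft defines the real
ones, and `stat_digest` and `is_unchanged` are hypothetical names):

```python
import struct
import zlib


def stat_digest(ctime, ino, dev, uid, gid):
    """Fold a few stat fields into one 32-bit value.

    The field selection and the use of CRC-32 are assumptions made for
    this sketch, not taken from the index v5 draft.
    """
    packed = struct.pack(">IIIII", ctime, ino, dev, uid, gid)
    return zlib.crc32(packed) & 0xFFFFFFFF


def is_unchanged(stored_digest, ctime, ino, dev, uid, gid):
    """Only equality is meaningful: the original fields cannot be
    recovered from the digest, so ordering tests are impossible."""
    return stored_digest == stat_digest(ctime, ino, dev, uid, gid)
```

A reader would store `stat_digest(...)` at write time and later call
`is_unchanged()` with fresh lstat() values to decide whether a file
needs to be re-examined.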

>   Thanks to Michael Haggerty, Nguyen Thai Ngoc Duy, Thomas Rast
>   and Robin Rosenberg for feedback.

> - Read the index format format and translate it to the current in

s/format format/on-disk file format/ or something?

>   memory format. This doesn't include reading any of the current
>   extensions, which are now part of the main index. The code again
>   is on github. [4] Thanks to Thomas Rast for reviewing the first
>   steps.

> - Started implementing the writer, which extracts the directories from
>   the in-memory format, and writes the header and the directories to
>   disk.
> - I found a few bugs in the algorithm for extracting the directories
>   and decided to completely rewrite it, using a hash table instead of
>   simple lists, since the old one would have too many corner cases to
>   handle.

What does "the algorithm" refer to?  Is it the one described in the
previous bullet point, or is it the code in production?  If the latter,
it would help to separate out the task to fix the breakage, as
people with the current or previous versions of Git will be
negatively affected until that bug is fixed.  If the former, I am not
sure if this task needs to be described in two bullet points ("I did
X, X had bug so I redid X in a different way" is still a single task
to do X).
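
For readers following along, here is roughly what "extracting the
directories with a hash table" could look like. This is only a sketch
under my own assumptions, not the code from the GSoC branch:
`extract_directories` is a hypothetical name, and the input is taken
to be a list of slash-separated paths as found in the in-memory index.

```python
import posixpath


def extract_directories(paths):
    """Group index paths by their containing directory.

    Returns a dict mapping each directory ("" for the top level) to the
    list of file names directly inside it. Hypothetical helper written
    for illustration only.
    """
    dirs = {"": []}
    for path in paths:
        dirname, basename = posixpath.split(path)
        # Register the directory and any ancestors not seen yet, so
        # intermediate directories without files still get an entry.
        parent = dirname
        while parent and parent not in dirs:
            dirs[parent] = []
            parent = posixpath.dirname(parent)
        dirs[dirname].append(basename)
    return dirs
```

With a table keyed by directory name, a directory that shows up
again after an unrelated path is simply another lookup, which is the
kind of corner case a plain list walk has to handle specially.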

> == Work done in the last week ==
>
> - Polished the patch for the ce_namelen field. The thread for the
>   patch can be found at [5].

Thanks for this one; I think it is ready for 'next', but if you are
still not satisfied I do not mind waiting for further perfection.