Re: [GSoC] Designing a faster index format

On Thu, Apr 19, 2012 at 5:25 PM, Thomas Gummerer <t.gummerer@xxxxxxxxx> wrote:
> As Thomas Rast pointed out on IRC today, it would be better to first
> implement a simple converter of the old index to the new format,
> and then implement a faster reader for the midterm, so we have a
> "future-proof" index format at the midterm and a fast reader.

You can skip writing the old-index reading code in the converter by
creating a git test command that reuses git's own code. It's really
simple: modify the Makefile to add your command name, say
test-index-converter, to TEST_PROGRAMS_NEED_X, then create
test-index-converter.c with this:

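#include "cache.h"

int main(int ac, char **av)
{
    int i;

    /* locate the repository so the index path can be resolved */
    setup_git_directory();
    /* load the existing (old format) index into active_cache */
    if (read_cache() < 0)
        die("unable to read index");
    for (i = 0; i < active_nr; i++) {
        struct cache_entry *ce = active_cache[i];
        /* process the entry and produce new format here */
    }
    return 0;
}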

"make test-index-converter" will give you the command.

Of course this comes at a cost, because the code that constructs the
new index may be slower to write in C than in Python. Alternatively,
have the test command just output something easy to parse and write
the Python converter from there.
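
For the "easy to parse" route, here is a rough sketch of what the
Python side could look like. It assumes (my assumption, nothing is
decided) that the test command prints one entry per line in the same
shape as "git ls-files --stage", i.e. "<mode> <sha1> <stage>TAB<path>".
The names Entry and parse_dump are made up for illustration, and the
stat fields a real converter would also need are left out.

```python
from collections import namedtuple

# Hypothetical in-memory record for one index entry (stat data omitted).
Entry = namedtuple("Entry", "mode sha1 stage path")


def parse_dump(lines):
    """Parse '<mode> <sha1> <stage>\\t<path>' lines into Entry tuples."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the first tab only, so paths containing tabs survive.
        meta, path = line.split("\t", 1)
        mode, sha1, stage = meta.split(" ")
        # The mode is printed in octal, the stage as a small decimal number.
        entries.append(Entry(int(mode, 8), sha1, int(stage), path))
    return entries


# Sample input in the assumed format; the sha1 is that of the empty blob.
sample = [
    "100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0\tMakefile\n",
    "100755 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0\tt/test-lib.sh\n",
]
entries = parse_dump(sample)
```

From a list like this the converter can build whatever the new format
needs without touching the old binary layout at all.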
--
Duy - Python ignorant

> The new writing algorithm will then be implemented in the second
> part of the Summer of Code.
>
> I'll just post the update on the timeline and the midterm evaluation here.
>
> -- Timeline --
> 24/04 - 01/05: Document the new index format.
> 02/05 - 11/05: Create a converter of the old index format to the new format in
> python.
> 12/05 - 18/06: Parse the index from disk to the current in-memory format. The
> old index format shall still be readable.
> 19/06 - 09/07: Implement the re-reading of a single record, if the crc32 doesn't
> match (Meaning the record has been changed under the reader).
> - Midterm evaluation -
> 10/07 - 21/07:  Map the current internal structure to the new index format.
> 22/07 - 31/07: Change the current in-memory structure to keep track of the
> changed files.
> 01/08 - 13/08: Write the index to disk in both the old and the new format
> depending on the choice of the user and make sure only the changed parts are
> really written to disk in the new format.
> 11/08 - 13/08: Test the new index and profile the gains compared to the old
> format.
> /* Development work will be a bit slower from 18/06 to 21/07 because at my
>  * University there are exams in this period. I probably will only be able to
>  * work half the hours. I'll be back up to full speed after that. */
>
> -- Midterm evaluation --
> At the midterm, there will be a python prototype for the conversion of the old
> index format to the new "future-proof" index format and a faster reader for the
> new format. The native write part will be completed in the second part of the
> Summer of Code.