Re: Google Summer of Code 2013 (GSoC13)

Thomas Rast <trast@xxxxxxxxxxx> writes:

> * We should prepare an "ideas page"[...]
>     https://github.com/trast/git/wiki/SoC-2013-Ideas

From where I'm currently sitting, I won't have the time to mentor this
year.  So my two earlier proposals are essentially up for grabs:

1. Improving parallelism in various commands
   -----------------------------------------
 
   Git is mostly written single-threaded, with a few commands having
   bolted-on extensions to support parallel operation (notably git-grep,
   git-pack-objects and the core.preloadIndex feature).
 
   We have recently looked into some of these areas and made a few
   optimizations, but a big roadblock is that pack access is entirely
   single-threaded.  The project would consist of the following steps:
 
    * In preparation (the half-step): identify commands that could
      benefit from parallelism.  `git grep --cached` and `git grep
      COMMIT` come to mind, but most likely also `git diff` and `git log
      -p`.  You can probably find more.
 
    * Rework the pack access mechanisms to allow the maximum possible
      parallel access.
 
    * Rework the commands found in the first step to use parallel pack
      access if possible.  Along the way, document the improvements with
      performance tests.
 
   The actual programming must be done in C using pthreads for obvious
   reasons.  At the very least you should not be scared of low-level
   programming.  Prior experience and access to one or more multi-core
   computers are a plus.

This one is probably still a contender.  However, it might be worth
first looking into whether using libgit2 for pack reading would be
easier and faster, since it is written to be reentrant from the ground
up.
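
To make the roadblock concrete, here is a rough sketch of what "parallel
workers, serialized pack access" looks like with pthreads.  This is not
Git code: `pack_mutex`, `read_object_locked()`, `scan_range()` and
`parallel_scan()` are hypothetical names, and the "pack" is simulated by
a function that formats a short string.  Each worker can search in
parallel, but every object read goes through one global lock, which is
roughly the situation the second step of the project would try to
improve.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NUM_OBJECTS 64
#define MAX_THREADS 16

/* Hypothetical global lock standing in for non-reentrant pack access. */
static pthread_mutex_t pack_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Simulated object reader; the caller must hold pack_mutex. */
static void read_object_locked(int id, char *buf, size_t len)
{
	snprintf(buf, len, "object %d %s", id,
		 id % 3 == 0 ? "needle" : "hay");
}

struct worker {
	int first, last;	/* half-open range [first, last) */
	int matches;		/* written only by the owning thread */
};

static void *scan_range(void *arg)
{
	struct worker *w = arg;
	char buf[64];
	int id;

	for (id = w->first; id < w->last; id++) {
		/* All readers serialize here: this is the bottleneck. */
		pthread_mutex_lock(&pack_mutex);
		read_object_locked(id, buf, sizeof(buf));
		pthread_mutex_unlock(&pack_mutex);

		/* The search itself runs outside the lock, in parallel. */
		if (strstr(buf, "needle"))
			w->matches++;
	}
	return NULL;
}

/* Returns the number of matching objects, or -1 if a thread failed to start. */
int parallel_scan(int nthreads)
{
	pthread_t tid[MAX_THREADS];
	struct worker w[MAX_THREADS];
	int created = 0, failed = 0, total = 0, i;

	if (nthreads < 1)
		nthreads = 1;
	if (nthreads > MAX_THREADS)
		nthreads = MAX_THREADS;

	for (i = 0; i < nthreads; i++) {
		w[i].first = NUM_OBJECTS * i / nthreads;
		w[i].last = NUM_OBJECTS * (i + 1) / nthreads;
		w[i].matches = 0;
		if (pthread_create(&tid[i], NULL, scan_range, &w[i])) {
			failed = 1;
			break;
		}
		created++;
	}
	/* Join whatever was started, even on failure. */
	for (i = 0; i < created; i++) {
		pthread_join(tid[i], NULL);
		total += w[i].matches;
	}
	return failed ? -1 : total;
}
```

Calling `parallel_scan(4)` splits the 64 simulated objects across four
threads and sums the per-thread match counts after joining.  The result
is the same for any thread count; only the time spent waiting on
`pack_mutex` changes, which is what the performance tests mentioned
above would need to measure.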


2. Improving the `git add -p` interface
   ------------------------------------

   The interface behind `git {add|commit|stash|reset} {-p|-i}` is shared
   and called `git-add--interactive.perl`.  This project would mostly
   focus on the `--patch` side, as that seems to be much more widely
   used; however, improvements to `--interactive` would probably also be
   welcome.

   The `--patch` interface suffers from some design flaws caused largely
   by how the script grew:

    * Application is not atomic: hitting Ctrl-C midway through patching
      may still touch files.

    * The terminal/line-based interface becomes a problem if diff hunks
      are too long to fit in your terminal.

    * Cannot go back and forth between files.

    * Cannot reverse the direction of the patch.

    * Cannot look at the diff in word-diff mode (and apply it normally).

   Due to the current design it is also pretty hard to add these features
   without adding to the mess.  Thus the project consists of:

    * Come up with more ideas for features/improvements and discuss them
      with users.

    * Cleanly redesign the main interface loop to allow for the above
      features.

    * Implement the new features.

   As the existing code is written in Perl, that is what you will use for
   this project.

This has already featured twice, and resulted in proposals that were
insufficiently advanced and too little work for a GSoC.  If nobody feels
like extending it to a bigger project, I'll just scrap it.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch