GSOC proposal: Parallel Git Commands

Goals and Steps:

This project aims to improve the parallelism and scalability of Git
commands. Although the current version of Git already uses multiple
threads in places (for example, in the git grep command), much remains
to be improved. For instance, git grep --cached and git diff have not
adopted a multi-threaded programming model. I therefore expect that
these and some other commands should also be parallelized.

The project can be finished by the following steps:

1.  Week 1 ~ 2: Profile the Git commands under demanding conditions,
such as a very large repository or a project with a long history. The
goal is to identify which commands take disproportionately more time
as the repository grows larger and more complex.

2.  Week 3 ~ 4: Profile the commands selected in the first step in
more detail to identify the bottlenecks and determine whether they can
be addressed through parallelization. Possible profiling tools include
oprofile, dtrace, or other suitable tools.

3.  Week 5 ~ 12: Redesign the data structures and data-processing
logic to parallelize the Git commands.

4.  Week 5 ~ 12: Repeat the profiling and testing to confirm that
these changes really improve the scalability and parallelism of Git
commands.

5.  Steps 3 and 4 may be repeated several times to fix errors and new
performance bottlenecks introduced by the new design.
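
As a rough illustration of the coarse-grained timing in steps 1 and 4,
the following is a minimal sketch, not part of the proposal itself. The
helper name time_command and the choice of the median over several runs
are my own assumptions; the detailed profiling in step 2 would still
need a real profiler such as oprofile or dtrace.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=5):
    """Run argv `repeats` times and return the median wall-clock time in seconds.

    Hypothetical helper for illustration: it measures total run time only
    and says nothing about where inside the command the time is spent.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard output so terminal I/O does not dominate the measurement.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Harmless stand-in workload so the sketch runs anywhere; inside a real
    # repository one would pass a Git command line instead.
    print(f"{time_command([sys.executable, '-c', 'pass']):.4f} s")
```

Running the same helper on repositories of increasing size, before and
after a change, gives the simple comparison that steps 1 and 4 describe.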

Success criteria:

    After the project, Git should scale well as the number of CPU
cores grows, or at least perform better than the current version.
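
One way to make "scales well" measurable is to compare the measured
speedup with an idealized bound. The sketch below uses Amdahl's law,
which is my own choice of model and is not named in the proposal; the
function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when `parallel_fraction` of the run time
    can be spread perfectly over `cores` (Amdahl's law, an idealized model)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def measured_speedup(serial_seconds, parallel_seconds):
    """Speedup actually observed: old run time divided by new run time."""
    return serial_seconds / parallel_seconds


def efficiency(speedup, cores):
    """Fraction of the ideal linear speedup that was achieved."""
    return speedup / cores
```

For example, if profiling shows that only half of a command's run time
can be parallelized, amdahl_speedup(0.5, 2) gives a bound of about 1.33
on two cores, so the serial part found in step 2 limits what the
redesign can achieve.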

Interfaces:

This project does not extend Git's interfaces; instead, it redesigns
the internal data structures and processing logic to parallelize it.
For instance, some data structures may be shared between loop
iterations, which may make parallelization difficult. So far, I have
examined the git grep and git diff commands, and the related files,
such as diff.c and grep.c, are expected to need modification. However,
as the project goes on, more problems may be revealed and more files
will need to be changed.
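
To illustrate the point about state shared between loop iterations,
here is a small sketch of the general pattern only. It is written in
Python for brevity, whereas Git itself is written in C, and it does not
reflect the actual code in grep.c or diff.c. The function names and the
sample file contents are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matching_lines(text, needle):
    """Per-file work that touches no state shared with other files."""
    return sum(1 for line in text.splitlines() if needle in line)


def grep_sequential(files, needle):
    total = 0  # accumulator carried from one loop iteration to the next
    for text in files.values():
        total += count_matching_lines(text, needle)
    return total


def grep_parallel(files, needle, workers=4):
    # Each task returns its own result; results are merged only at the
    # end, so no accumulator is shared between workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_file = pool.map(lambda text: count_matching_lines(text, needle),
                            files.values())
        return sum(per_file)


# Made-up sample data; these are not real Git source contents.
FILES = {
    "diff.c": "int a;\n/* TODO: split */\n",
    "grep.c": "TODO one\nTODO two\n",
    "README": "nothing here\n",
}
```

The design choice shown is to move the shared accumulator out of the
loop body: once each iteration produces an independent result, the
iterations can be distributed over workers and combined afterwards.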


About Me:

Jicheng Shi (施继成)



First Year Graduate Student

Parallel Processing Institute
Fudan University
RM 320, Software Building, 825 Zhangheng Road, Shanghai
P.R. China, 201203
Email: jcshi@xxxxxxxxxxxx

Education

Sep. 2011 - Now. Software School, Fudan University

Sep. 2007 - July. 2011. Software School, Fudan University

Publication

Xiang Song, Jicheng Shi, Haibo Chen and Binyu Zang. Revisiting
Software Zero-Copy for Web-caching Applications with Twin Allocators.
Proceedings of 2012 USENIX Annual Technical Conference (USENIX ATC
2012, short paper, to appear). Boston, Massachusetts, USA, June 2012.

Jicheng Shi, Xiang Song, Haibo Chen and Binyu Zang. Limiting Cache-based
Side-Channel in Multi-tenant Cloud using Dynamic Page Coloring. The
7th Workshop on Hot Topics in System Dependability (HotDep'11).



Research Projects

Paralleling VM Migration.  Dec. 2011 - Now

TCP/UDP Stack Buffer Zero-Copy.  May 2011 - Nov. 2011

Defending Side Channel Attack in Cloud.  Nov. 2009 - Apr. 2011

Skills

Languages: C (4 years), Java (4 years), C++, Python (2 years), Common
Lisp, JavaScript

Systems: Linux (operating system), Xen (virtualization system)