On Mon, Sep 25, 2017 at 05:28:43PM -0700, Sabelo Mhlambi wrote:

> Hi Jeff (and the Git community),
>
> As my intro to open source contributions I'd like to attempt the "Speeding
> up history traversals with caches" as outlined here
> https://git.github.io/Outreachy-15/.
>
> It seems like a challenging and worthwhile problem. May I have more
> information on the project and on how to get started on the application.
>
> Thanks!

Hi Sabelo, welcome to Git!

Unfortunately your message didn't make it to the mailing list, because
the list software is strict about messages not including any HTML parts.
It looks like you're using Gmail; you'll need to ask it to send
plain-text emails.

The general idea of the project is: a lot of git commands need to access
commit objects to walk the history graph, but they're expensive to
access because we have to inflate the whole commit object from disk.
What I'd like to have instead is a compact representation that we can
quickly use to get the main interesting data out of a commit without
having to inflate all of the bytes.

I did a prototype of this a few years ago:

  https://public-inbox.org/git/20130129091434.GA6975@xxxxxxxxxxxxxxxxxxxxx/

Compared to those patches, there are a lot of possible things to work
on:

  - the code needs to be cleaned up and ported to a more modern git

  - the implementation is a bit complex; it was anticipating having
    several types of auxiliary files, but probably we really just need
    one

  - we've also discussed storing computed data about the graph, such as
    generation numbers, which can help speed up some traversals

  - we may be able to cache some interesting tree data (e.g., bitmaps of
    which paths are touched by a particular commit).

I wouldn't expect us to cover all of that during the internship period,
but it gives a sense of the possible directions.

That thread may work as a starting point for understanding the problem
space.
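To make the generation-number idea a bit more concrete, here is a
minimal sketch in Python. It is not code from the prototype or from Git
itself; the toy history and the function names are made up for
illustration, and it assumes the usual definition (a root commit has
generation 1, any other commit has one more than the largest generation
among its parents).

```python
# Hypothetical toy history: commit name -> list of parent names.
# The names and layout are invented; this is not Git's on-disk format.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
    "E": ["D"],
    "F": ["C"],
}


def generation_numbers(parents):
    """Compute gen(c) = 1 + max(gen(p) for parents p), with roots at 1."""
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            c = stack[-1]
            if c in gen:
                stack.pop()
                continue
            missing = [p for p in parents[c] if p not in gen]
            if missing:
                # Parents must be numbered before the child.
                stack.extend(missing)
            else:
                gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
                stack.pop()
    return gen


def is_ancestor(parents, gen, ancestor, descendant):
    """Walk back from descendant; return (answer, commits visited)."""
    seen = set()
    todo = [descendant]
    visited = 0
    while todo:
        c = todo.pop()
        if c in seen:
            continue
        seen.add(c)
        visited += 1
        if c == ancestor:
            return True, visited
        # An ancestor always has a strictly smaller generation number,
        # so nothing at or below gen[ancestor] can lead to it.
        if gen[c] <= gen[ancestor]:
            continue
        todo.extend(parents[c])
    return False, visited


GEN = generation_numbers(PARENTS)
```

In this toy history, asking whether B is an ancestor of F stops at C
(same generation as B) without ever looking at A. On a real repository
the cached numbers would let a walk skip large parts of old history in
the same way.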
You can also probably find some interesting discussions if you search
for "generation number" in the mailing list archive at
https://public-inbox.org/git.

The first step is probably to get comfortable with building Git and
submitting a small patch. Christian posted some advice on finding a
topic to work on:

  https://public-inbox.org/git/CAP8UFD3vPQHJZNt1+egKkshiyqrGKiJp7eWU-Es6bTLgvXe1Kg@xxxxxxxxxxxxxx/

Let us know if you get stuck or if you have any questions!

-Peff