Re: Questions on GSoC 2019 Ideas

Hi,

On Thu, Apr 4, 2019 at 3:15 AM Matheus Tavares Bernardino
<matheus.bernardino@xxxxxx> wrote:
>
> I've been studying the codebase and looking through older emails in the
> ML that discuss what I want to propose as my GSoC project. In
> particular, I found a thread about slow git commands on Chromium, so I
> reached out to them on Chromium's ML to ask whether it's still an
> issue. I got the following answer:
>
> On Wed, Apr 3, 2019 at 1:41 PM Erik Chen <erikchen@xxxxxxxxxxxx> wrote:
> > Yes, this is absolutely still a problem for Chrome. I filed some bugs for common operations that are slow for Chrome: git blame [1], git stash [2], git status [3]
> > On Linux, blame is the only operation that is really problematic. On macOS and Windows ... it's hard to find a git operation that isn't slow. :(

Nice investigation. Regarding git status, though, I wonder if they
have tried the available optimizations, such as the untracked cache or
core.fsmonitor.
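
In case they haven't, both knobs can be tried out per repository. A
quick sketch (the Watchman hook path below is an assumption about
their setup; it requires Watchman and the sample integration hook
shipped with git):

```shell
# Try the knobs in a scratch repository first.
cd "$(mktemp -d)" && git init -q .

# Check whether the filesystem handles the untracked cache correctly,
# then enable it.
git update-index --test-untracked-cache
git config core.untrackedCache true

# Optionally hook up fsmonitor; this assumes the sample Watchman
# integration script has been installed as query-watchman.
git config core.fsmonitor .git/hooks/query-watchman
```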

> I don't really know if threading would help stash and status, but I
> think it could help blame. From the little I've read of blame's code
> so far, my guess is that the priority queue used for the commits could
> be an interface for a producer-consumer mechanism, and that way
> assign_blame's main loop could be done in parallel. And as we can see
> at [4], that is 90% of the command's time. Does this make sense?

I can't really tell as I haven't studied this, but from the links in
your email I think it kind of makes sense.

Instead of doing assign_blame()'s main loop in parallel, though, if my
focus were only on making git blame faster, I think I would first try
to cache xdl_hash_record() results and then, if possible, compute
xdl_hash_record() in parallel, as it seems to be a big bottleneck and
quite low-hanging fruit.

> But as Duy pointed out, if I recall correctly, for git blame to be
> parallel, pack access and diff code would have to be made thread-safe
> first. And it also seems, from what we've talked about earlier, that
> all of this wouldn't fit in a single GSoC. So, would it be a good
> GSoC proposal to try "making the code used by blame thread-safe",
> targeting future parallelization of blame to be done after GSoC?

Yeah, I think it would be a nice proposal, even though it doesn't seem
to be the most straightforward way to make git blame faster.

Back in 2008, when we proposed a GSoC project about creating a
sequencer, it wasn't something that would easily fit in a GSoC, and in
fact it didn't. Over the long run, though, it has been very fruitful:
the sequencer is now used by cherry-pick and rebase -i, and there are
plans to use it even more. So unless people think it's not a good idea
for some reason, which hasn't been the case so far, I am ok with a
GSoC project like this.

> And if
> so, could you please point out which files I should be studying to
> write the plan for this proposal? (Unfortunately, I wasn't able to
> study the pack access and diff code yet; I got carried away looking
> for performance hotspots and now I'm a bit behind schedule :( )

I don't think you need to study everything yet, and you have already
done a lot of studying, so I would suggest first sending a proposal
soon with the information you have right now; then, depending on the
feedback you get and the time left (likely not much!!!), you can study
some parts of the code a bit more.

> Also, an implementation of fuzzy blame is being developed right
> now [5], and Jeff (CC-ed) recently suggested another performance
> improvement that could be made in blame [6]. So I would like to know
> whether you think it is worth putting effort into trying to
> parallelize it.

What you would do seems compatible to me with the fuzzy blame effort
and an effort to cache xdl_hash_record() results.

Thanks,
Christian.


