Re: [PATCH 0/9] [RFC] Changed Paths Bloom Filters

On 12/31/2019 11:45 AM, Jakub Narebski wrote:
> "Garima Singh via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>> Performance Gains: We tested the performance of 'git log -- <path>' on the git
>> repo, the linux repo and some internal large repos, with a variety of paths
>> of varying depths.
>>
>> On the git and linux repos: We observed a 2x to 5x speed up.
>>
>> On a large internal repo with files seated 6-10 levels deep in the tree: We
>> observed 10x to 20x speed ups, with some paths going up to 28 times faster.
> 
> Could you provide some more statistics about this internal repository,
> such as number of files, number of commits, perhaps also number of all
> objects?  Thanks in advance.
> 
> I wonder why such large difference in performance 2-5x vs 10-20x.  Is it
> about the depth of the file hierarchy?  How would the numbers look for
> files seated closer to the root in the same large repository, like 3-5
> levels deep in the tree?

The internal repository we saw these massive gains on has:
- 413579 commits
- 183303 files distributed across 34482 folders
The size on disk is about 17 GiB.

And yes, the difference in performance gains is mostly due to how
deep the files are in the hierarchy. How often a file has been touched
also makes a difference: the gains are less dramatic if the file has a
very sparse history, even if it sits deep in the tree.

The numbers from the git and linux repos, for instance, are for files
closer to the root, hence 2x to 5x.
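
To illustrate why depth matters, here is a toy cost model in Python. It
is only a sketch under stated assumptions, not the code from this
series: the filter layout, the hash choice, and the names
ChangedPathFilter, filter_for_commit, levels_descended and walk_cost are
all hypothetical. It assumes that a tree diff descends one level for
every leading directory of the path that the commit touched, and that a
per-commit filter answering "definitely not changed" lets the walk skip
that diff entirely.

```python
import hashlib


class ChangedPathFilter:
    """Toy Bloom filter over the paths one commit changed.

    Illustration only: this is not Git's on-disk format or hash function.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, path):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{path}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, path):
        for pos in self._positions(path):
            self.bits |= 1 << pos

    def maybe_contains(self, path):
        # False means "definitely not changed"; True means "maybe changed".
        return all((self.bits >> pos) & 1 for pos in self._positions(path))


def filter_for_commit(changed_files):
    """Add each changed file and every one of its leading directories."""
    commit_filter = ChangedPathFilter()
    for file_path in changed_files:
        parts = file_path.split("/")
        for i in range(1, len(parts) + 1):
            commit_filter.add("/".join(parts[:i]))
    return commit_filter


def levels_descended(changed_files, path):
    """Tree levels a diff inspects before it can rule the path in or out."""
    parts = path.split("/")
    levels = 0
    for i in range(1, len(parts) + 1):
        levels += 1  # compare the entry for parts[i - 1] at this level
        prefix = "/".join(parts[:i])
        touched = any(
            f == prefix or f.startswith(prefix + "/") for f in changed_files
        )
        if not touched:
            break  # identical subtree: no need to descend further
    return levels


def walk_cost(history, path, use_filter):
    """Total tree levels inspected while walking `history` for `path`."""
    cost = 0
    for changed_files in history:
        if use_filter:
            if not filter_for_commit(changed_files).maybe_contains(path):
                continue  # filter says "definitely not": skip the tree diff
        cost += levels_descended(changed_files, path)
    return cost


if __name__ == "__main__":
    # Hypothetical history: each entry lists the files one commit changed.
    history = [["a/b/c/x.txt"], ["a/b/c/d/y.txt"], ["docs/readme.md"]]
    for path in ("a/b/c/d/y.txt", "README"):
        print(path, walk_cost(history, path, False), walk_cost(history, path, True))
```

In this model a commit that touches a neighbouring file deep in the same
directory forces the unfiltered walk to descend several levels before it
finds an unchanged subtree, while a root-level path is ruled out after a
single comparison. The filter skips both cases, so it saves more work per
commit for deep paths. A path with a sparse history in a quiet directory
costs little to rule out either way, which fits the smaller gains seen
there.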

Thanks! 
Garima Singh


