Re: [PATCH 04/12] bloom: clear each bloom_key after use

On 11/04/2021 09:26, SZEDER Gábor wrote:
On Fri, Apr 09, 2021 at 06:47:23PM +0000, Andrzej Hunt via GitGitGadget wrote:
From: Andrzej Hunt <ajrhunt@xxxxxxxxxx>

fill_bloom_key() allocates memory into bloom_key, so we need to clean
that up once the key is no longer needed.
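
The fill-then-clear pattern described above can be sketched roughly as
follows. This is a simplified stand-in, not git's actual code: the
struct layout, the demo_* names, and the placeholder hash are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Bloom key; the layout is an assumption. */
struct demo_bloom_key {
	uint32_t *hashes;	/* heap-allocated, one entry per hash */
	size_t nr_hashes;
};

/*
 * Hypothetical fill function: allocates the hash array for one path.
 * The hash itself is a placeholder (FNV-1a style), not what git uses.
 */
static int demo_fill_key(struct demo_bloom_key *key, const char *path,
			 size_t nr_hashes)
{
	const unsigned char *p;
	uint32_t h = 2166136261u;
	size_t i;

	assert(key && path);
	key->hashes = calloc(nr_hashes, sizeof(*key->hashes));
	if (!key->hashes)
		return -1;
	key->nr_hashes = nr_hashes;

	for (p = (const unsigned char *)path; *p; p++)
		h = (h ^ *p) * 16777619u;
	for (i = 0; i < nr_hashes; i++)
		key->hashes[i] = h + (uint32_t)i * 0x9e3779b9u;
	return 0;
}

/* Release what demo_fill_key() allocated and reset the key. */
static void demo_clear_key(struct demo_bloom_key *key)
{
	free(key->hashes);
	key->hashes = NULL;
	key->nr_hashes = 0;
}

/* Fill and clear one key per path; returns the number of keys handled. */
static size_t demo_process_paths(const char **paths, size_t nr)
{
	size_t i, done = 0;

	for (i = 0; i < nr; i++) {
		struct demo_bloom_key key;

		if (demo_fill_key(&key, paths[i], 7))
			break;
		/* ... key.hashes would be used to set or test filter bits ... */
		demo_clear_key(&key);	/* omitting this leaks one array per path */
		done++;
	}
	return done;
}
```

Without the clear call inside the loop, every iteration would leave
its hash array behind, which is the kind of per-key leak this patch
addresses.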

This fixes the following leak which was found while running t0002-t0099.
Although this leak is happening in code being called from a test-helper,
the same code is also used in various locations around git, and could
presumably happen during normal usage too.

It does indeed happen: 'git commit-graph write --reachable
--changed-paths' generates Bloom filters for every commit, with each
filter containing all paths modified by its associated commit, so it
leaks a lot of 7 * 4-byte hashes.  This patch reduces the memory usage
of that command:

                          Max RSS
                     before      after
   ---------------------------------------------
   android-base     1275028k   1006576k   -21.1%
   chromium         3245144k   3127764k    -3.6%
   cmssw             793996k    699156k   -12.0%
   cpython           371584k    343480k    -7.6%
   elasticsearch     748104k    637936k   -14.7%
   freebsd-src       819020k    741272k    -9.5%
   gcc               867412k    730332k   -15.8%
   gecko-dev        2619112k   2457280k    -6.2%
   git               252684k    216900k   -14.2%
   glibc             239000k    222228k    -7.0%
   go                264132k    251344k    -4.9%
   homebrew-cask     542188k    480588k   -11.4%
   homebrew-core     805332k    715848k   -11.1%
   jdk               417832k    342928k   -17.9%
   libreoff-core    1257296k   1089980k   -13.3%
   linux            2033296k   1759712k   -13.5%
   llvm-project     1067216k    956704k   -10.4%
   mariadb-srv       695172k    559508k   -19.5%
   postgres          340132k    317416k    -6.7%
   rails             325432k    294332k    -9.6%
   rust              655244k    584904k   -10.7%
   tensorflow        507308k    480848k    -5.2%
   webkit           2466812k   2237332k    -9.3%
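
The arithmetic behind these figures can be sketched as below. The
function names are hypothetical, and the per-key size only counts the
7 * 4-byte hash payload mentioned above; allocator overhead per
allocation would presumably come on top of that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Payload bytes lost if nr_keys keys are never cleared (7 hashes each). */
static size_t leaked_hash_bytes(size_t nr_keys)
{
	return nr_keys * 7 * sizeof(uint32_t);
}

/* Relative change in Max RSS, in percent; negative means a reduction. */
static double rss_change_pct(long before_k, long after_k)
{
	assert(before_k > 0);
	return 100.0 * (double)(after_k - before_k) / (double)before_k;
}
```

For the git row, rss_change_pct(252684, 216900) comes out at roughly
-14.2, and leaked_hash_bytes(1000000) is 28000000, i.e. about 28 MB of
hash payload per million uncleared keys.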

Just out of curiosity, I disabled the questionable hardcoded 512-path
limit on the size of modified path Bloom filters, and the memory usage
in the jdk repository sank by over 55%, from 849520k to 379760k.

Please feel free to include any of the above data points in the commit
message.

Thank you for the detailed analysis - these kinds of results are very motivating! I will include a brief summary (something like "10% typical improvement for 'commit-graph write' for large repos") along with a link to your posting for those who want the full picture.


