Re: [PATCH 09/16] documentation: add documentation for the bitmap format

On Thu, Jun 27, 2013 at 09:07:38AM -0700, Shawn O. Pearce wrote:

> > And the pack-order versus idx-order for the bitmaps is still up in the
> > air. Do we have numbers on the on-disk sizes of the resulting EWAHs?
> 
> I did not see any presented in this thread, and I am very interested
> in this aspect of the series. The path hash cache should be taking
> about 9.9M of disk space, but I recall reading the bitmap file is 8M.
> I don't understand.

I don't know where the 8M number came from, or if it was on the kernel
repo. My bitmap-enabled pack of linux-2.6 (about 3.2M objects) using
Vicent's patches looks like:

  $ du -sh *
  42M     pack-9ea76831aec6c49c5ff42509a2a2ce97da13c5ad.bitmap
  87M     pack-9ea76831aec6c49c5ff42509a2a2ce97da13c5ad.idx
  630M    pack-9ea76831aec6c49c5ff42509a2a2ce97da13c5ad.pack

Packing the same repo with "jgit debug-gc" (jgit 3.0.0) yields:

  $ du -sh *
  3.0M    pack-2478783825733a1f1012f0087a0b5a92aa7437d8.bitmap
  82M     pack-2478783825733a1f1012f0087a0b5a92aa7437d8.idx
  585M    pack-2478783825733a1f1012f0087a0b5a92aa7437d8.pack
  4.8M    pack-f61fb76112372288923be7a0464476892dfebe3e.idx
  97M     pack-f61fb76112372288923be7a0464476892dfebe3e.pack

If we assume that 12M of that is name-hash, the remaining 30M is still
an order of magnitude larger than jgit's 3.0M. For reference, jgit
created 327 bitmaps (according to its progress eye candy), and Vicent's
patches generated 385. So that explains some of the increase, but the
per-bitmap size is still much larger.
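
A rough sketch of the arithmetic behind that comparison, using only the
rounded "du -sh" figures and bitmap counts quoted above. The 4-bytes-per-
object name-hash size is an assumption (it is what makes 3.2M objects come
out near 12M), and the helper name is made up for illustration:

```python
# Back-of-the-envelope per-bitmap sizes from the rounded du(1) output above.
# ASSUMPTION: the name-hash cache stores one 4-byte hash per object.

MIB = 1024 * 1024


def per_bitmap_kib(bitmap_file_mib, n_bitmaps, name_hash_mib=0.0):
    """Average size of one bitmap in KiB, after subtracting any name-hash data."""
    return (bitmap_file_mib - name_hash_mib) * 1024 / n_bitmaps


# ~3.2M objects at an assumed 4 bytes each, expressed in MiB.
name_hash_estimate_mib = 3.2e6 * 4 / MIB

# Vicent's patches: 42M file, 385 bitmaps, ~12M assumed to be name-hash.
vicent_kib = per_bitmap_kib(42, 385, name_hash_mib=12)

# jgit 3.0.0: 3.0M file, 327 bitmaps, no name-hash data assumed.
jgit_kib = per_bitmap_kib(3.0, 327)

ratio = vicent_kib / jgit_kib

print(f"name-hash estimate: {name_hash_estimate_mib:.1f} MiB")
print(f"per bitmap: {vicent_kib:.1f} KiB vs {jgit_kib:.1f} KiB ({ratio:.1f}x)")
```

Since the inputs are rounded by du, the result is only good to about one
significant figure.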

> The path hash cache may still be required, Colby and I have been
> debating the merits of having the data available for delta compression
> vs. the increase in memory required to hold it.

I guess this is not an option for JGit, but for C git, an mmap-able
name-hash file means we can just fault in the pages mentioning objects
we actually need it for. And its use can be completely optional; in
fact, it doesn't even need to be inside the .bitmap file (though I
cannot think of a reason it would be useful outside of having bitmaps).
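
To illustrate the mmap idea, here is a minimal sketch in Python rather
than C. The file layout is purely hypothetical (a flat array of 4-byte
big-endian hashes indexed by object position) and is not the format used
by any of the patches; the function names are invented for the example:

```python
import mmap
import os
import struct
import tempfile

# ASSUMPTION: one unsigned 32-bit big-endian name hash per object.
ENTRY = struct.Struct(">I")


def write_name_hashes(path, hashes):
    """Write the hashes as a flat array, one fixed-size entry per object."""
    with open(path, "wb") as f:
        for h in hashes:
            f.write(ENTRY.pack(h))


def lookup_name_hash(path, pos):
    """Map the file read-only and read the entry for object position `pos`.

    Only the page containing that entry has to be faulted in; the rest of
    the file is never read unless something else touches it.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return ENTRY.unpack_from(m, pos * ENTRY.size)[0]


def demo():
    """Round-trip a few made-up hash values through a temporary file."""
    fd, path = tempfile.mkstemp(suffix=".namehash")
    os.close(fd)
    try:
        write_name_hashes(path, [0xDEADBEEF, 42, 7])
        return lookup_name_hash(path, 1)
    finally:
        os.remove(path)
```

A real reader would keep the mapping open across lookups instead of
remapping on every call; the per-call mapping here just keeps the sketch
short.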

-Peff