Re: [PATCH v2 1/8] packfile: prepare for the existence of '*.rev' files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 13, 2021 at 11:26:59PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
>
> > Specify the format of the on-disk reverse index 'pack-*.rev' file, as
> > well as prepare the code for the existence of such files.
>
> We've changed the pack .idx file format once as the file format is
> versioned.  I wonder if you considered placing the reverse index
> information in the same file, with version bump, in the .idx?

Funny enough, I couldn't remember whether I considered this or not.
Peff reminded me off-list that he and I *had* considered this, but
decided against it.

The main benefit to introducing a new '.rev' format is that we can have
packs that can be upgraded to write a reverse index without having to
rewrite their forward index. It would also allow us to avoid teaching
other implementations about a new version of the index file (since they
can ignore it and continue to build their equivalent of the reverse
index file in memory by reading the forward index).

(Peff reminds me that dumb-http does look at remote .idx files, so this
new format would leak across to clients, whether or not that's something
to be concerned about...).

Of course, having the contents of the .rev file be included in the .idx
file nets us one fewer file to manage, but I'm not sure that's a reason
to do things one way or another.

Your response did pique my interest, since I was wondering if we could
improve the cold cache performance if the .rev file's contents were
included in the .idx, but after giving it some thought I don't think we
can. Reasons are:

  - If the reverse index's contents appears at the end of the .idx file,
    then in any .idx file large enough to matter, we'll almost certainly
    still be evicting cache lines back and forth when swapping between
    reading the forward- and reverse-indexes. So, no gains to be had
    there.

  - If, on the other hand, we included the reverse index's contents by
    interleaving it with the forward index's offsets, then we'd be
    worsening the cache performance of the forward index.

So, I'm more in favor of a new .rev file rather than a v3 .idx version.
Apologies for not including more of a rationale "why" in the cover
letter (had I not forgotten that I'd even considered it, I would have).

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux