Re: [PATCH 1/6] Documentation/technical: describe bitmap lookup table extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Taylor Blau <me@xxxxxxxxxxxx> wrote:

>     In cases where the result can be read or computed without
>     significant additional traversal (e.g., all commits of interest
>     already have bitmaps computed), we can save some time loading and
>     parsing a majority of the bitmap file that we will never read.
>
>     But in cases where the bitmaps are out-of-date, or there is
>     significant traversal required to go from the reference tips to
>     what's contained in the .bitmap file, this table provides minimal
>     benefit (or something).
>
> Of course, you should verify that that is actually true before we insert
> it into the commit message as such ;-). But that sort of information may
> help readers understand what the purpose of this change is towards the
> beinning of the series.

The performance tests cover tests for command like "git rev-list --count
--objects --all", "simulated clone", "simulated fetch" etc. And I tested
it with both the Git and Linux. In both cases, the average cost of
"Without lookup table" is bigger than "with lookup table". The margin of
difference is bigger for linux. Though, I need to fix the calculation
of xor-offset (see my reply to derrick), the fix will not affect the
performance too much. So, what you're saying is true. I think I didn't
write the bitmap out-of-date test though.

> Here and elsewhere: I typically use my <me@xxxxxxxxxxxx> address when
> contributing to Git. So any trailers that mention my email or commits
> that you send on my behalf should use that address, too.

Ohh, sorry! Will fix it.

> It the space between "(0xf)" and the first ":" intentional? Similarly,
> should there be two or three colons at the end (either "::" or ":::")?

Yes, it is intentional. My previous patch (formatting the bitmap-format.txt)
uses nested description lists. ":::" means it is the level 3 description list.
The space is required else asciidoc will assume that it is level 4 description
list.

> I remember we had a brief off-list discussion about whether we should
> store the full object IDs in the offset table, or whether we could store
> their pack- or index-relative ordering. Is there a reason to prefer one
> or the other?
>
> I don't think we need to explain the choice fully in the documentation
> in this patch, but it may be worth thinking about separately
> nonetheless. We can store either order and convert it to an object ID in
> constant time.
>
> To figure out which is best, I would recommend trying a few different
> choices here and seeing how they do or don't impact your performance
> testing.

I think at that time I thought it would add extra cost of computing
the actual commit ids from those index position. So, I didn't go 
further here.

I still have a feeling that there is some way to get rid of this
list of commit ids. But at the same time, I do not want to add
extra computation to the code.

> A couple of small thoughts here. I wonder if we'd get better locality if
> we made each record look something like:
>
>     (object_id, offset, xor_pos)
>
> Where object_id is either 20- or 4-bytes long (depending if we store the
> full object ID, or some 4-byte identifier that allows us to discover
> it), offset is 8 bytes long, and xor_pos is 4-bytes (since in practice
> we don't support packs or MIDXs which have more than 2^32-1 objects).
>
> In the event that this table doesn't fit into a single cache line, I
> think we'll get better performance out of reading it by not forcing the
> cache to evict itself whenever we need to refer back to the object_id.

Ok, will look into it.

> I mentioned in my reply to Stolee earlier, but I think that we should
> either (a) try to remember what this is for and document it, or (b)
> remove it.

Let us for now remove it.

Thanks :)



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux