Re: [PATCH v3] repack: enable bitmaps by default on bare repos

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, May 04 2019, Jeff King wrote:

> On Thu, Apr 25, 2019 at 04:16:46PM +0900, Junio C Hamano wrote:
>
>> I was revisiting the recent "What's cooking" report, and I am not
>> sure what the current status of the topic is.
>>
>> I do not get a feel that the current bitmap implementation has been
>> widely used in repositories that have vastly different access
>> patterns---it probably has been tried only by those who can afford
>> the engineering cost to see if the implementation happens to work
>> well for their workload and some may have chosen to adopt it while
>> others didn't.  So it may be very well tuned for the former people
>> but once we merge this topic down, we'll hear from others with quite
>> different workload, which may lead to us tuning the code to bit
>> better to their workload while not hurting other existing users,
>> hopefully.
>>
>> Or not.
>
> Note that Ævar's case was somebody running bitmaps locally and trying to
> push, which I think is generally not a good match for bitmaps (even when
> they work, they cost more to generate than what you save if you're only
> pushing once).

Right. It was *not* caused by this "enable bitmaps by default on bare
repos" patch (which I wasn't even running with), but *is* indicative of
a pretty big edge case with enabling bitmaps that *will* happen for some
on such bare repos if we ship the patch.

> The goal of Eric's patch was that by kicking in for bare repos, we'd
> mostly be hitting servers that are serving up fetches. So if by
> "workload" you mean that we some people might use bare repos for other
> cases, yeah, there's a potential for confusion or regression there.

As noted I suspect that's a really rare use-case in practice, and in
reply to Junio's <xmqq1s1qy2ox.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx> upthread I
think it's fine to just try this. Maybe we'll finally get such use-cases
out of the woodworks by turning it on by default.

As an aside this is the Nth time I notice how crappy that "Enumerating
objects" progress bar is. We do a *lot* of things there, including this
bitmap calculation.

But just splitting it up might result in either no progress (all
individually below 2 seconds), or a lot of noise as you have 20 things
that each take 2 seconds. I wonder if someone's looked at supporting:

    Enumerating Objects (X%) => Calculating bitmaps (Y%)

Where X% is the total progres, and %Y is the sub-progress. I eyeballed
doing this once by "chaining" the progress structs, but there's probably
a less crappy way...


> If you mean that bitmaps might not work for some workloads even when
> we're serving a lot of fetches, I won't say that's _not_ true, but my
> experience is that they are generally a net win. Both for the smaller
> repositories we see on github.com, but also for big, busy ones that our
> on-premises customers use.
>
>   Actually, there is one curiosity with Eric's patch that I haven't
>   tested. As I've mentioned before, we store "forks" as single
>   repositories pointing to a single shared alternates repository. Since
>   the bitmap code only handles one .bitmap per invocation, you really
>   want just one big one in the shared repo. If "git repack" in the forks
>   started generating one, that would be surprising and annoying.
>
>   In practice this is a pretty extreme corner case. And a lot would
>   depend on how you're using "repack" in the fork (e.g., a partial
>   repack would know that it can't generate bitmaps anyway). I'm pretty
>   sure it would not even impact our setup at all, but I can probably
>   come up with a devils advocate one where it would.
>
>> I am somewhat tempted to make things more exciting by merging it to
>> 'next' soonish, but I guess Ævar and you are not quite ready for
>> that excitement yet, judging from the following (which looks quite
>> sensible suggestions)?
>
> It's OK with me for this to go to 'next'. Note that the other two
> patches from me could actually graduate separately. One is a
> straight-out test fix, and the other should always be a win (and does
> nothing if you're not already generating bitmaps).
>
> By the way, there were some timing puzzles mentioned in that second
> commit. I re-ran them today and everything was what I'd expect. So I
> wonder if I just screwed up the timings before. I can re-write that
> commit message if it hasn't made it to 'next' yet.
>
> -Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux