Re: [PATCH v4 4/6] pack-objects: generate cruft packs at most one object over threshold

Taylor Blau <me@xxxxxxxxxxxx> writes:

> When generating multiple cruft packs with 'git repack --max-cruft-size',
> we use 'git pack-objects --cruft --max-pack-size' (with many other
> elided options), filling in the '--max-pack-size' value with whatever
> was provided via the '--max-cruft-size' flag.
>
> This causes us to generate a pack that is smaller than the specified
> threshold. This poses a problem since we will never be able to generate
> a cruft pack that crosses the threshold.

So far I see absolutely *NO* problem described in the above.  The
user said "I want to chop them into 200MB pieces but do not exceed
the threshold" and the system honored that wish.

> In effect, this means that we
> will try and repack its contents over and over again.

The end effect may well be problematic, but isn't that due to the
way we decide when to repack?  You see a 199MB cruft pack plus some
other cruft data.  No new cruft has been generated and no existing
cruft has expired, but you do not know these facts until you try to
repack.  Because 200MB is the limit and 199MB is smaller than 200MB,
you include the 199MB pack among the ones to be recombined into the
new cruft pack.  You do not know that it is 199MB only because the
earlier repack found every remaining piece of cruft to be larger
than 1MB; had there been a 0.5MB piece of cruft, the pack might have
come closer to 200MB.
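
To make the arithmetic concrete, here is a toy model of that
scenario.  It is only a sketch and not how pack-objects is written:
the helper fill_pack() and the object sizes are made up for
illustration, and the old 199MB pack is treated as one unit.

```python
MB = 1024 * 1024

def fill_pack(sizes, limit):
    """Toy model: add an item only if the pack stays at or below limit.

    Returns (pack_total, leftovers); leftovers would go to a later pack.
    """
    total, leftovers = 0, []
    for size in sizes:
        if total + size <= limit:
            total += size
        else:
            leftovers.append(size)
    return total, leftovers

LIMIT = 200 * MB
old_pack = 199 * MB                      # result of the earlier repack
other_cruft = [3 * MB, 2 * MB, 5 * MB]   # every piece is larger than 1MB

# Nothing else fits, so the "new" pack is the same 199MB again ...
total, leftovers = fill_pack([old_pack] + other_cruft, LIMIT)

# ... and because it is still below the limit, the next repack will
# select it for recombination once more.
will_be_picked_again = total < LIMIT

# Had there been a 0.5MB piece of cruft, the pack would have grown.
total_with_small, _ = fill_pack([old_pack] + other_cruft + [MB // 2], LIMIT)
```

In this model the pack never moves off 199MB, yet its size alone
cannot tell the next repack that it is already as full as it can get.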

So would it be feasible to remember how the 199MB cruft pack came to
be in the object store (i.e. an earlier repack filled it as much as
possible), and add logic that says "if there is nothing to expire
out of this one, do not attempt to repack---this is fine as-is"?
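
A minimal sketch of what such a check might look like follows.
Everything in it is hypothetical: the CruftPack record, the
filled_to_capacity marker and should_repack() are invented names,
and Git does not record such a marker today as far as this message
describes.

```python
from dataclasses import dataclass

MB = 1024 * 1024

@dataclass
class CruftPack:
    size: int                         # on-disk size in bytes
    object_mtimes: list               # per-object mtimes, seconds since epoch
    filled_to_capacity: bool = False  # hypothetical: set by the earlier repack

def should_repack(pack, limit, expire_before):
    """Decide whether a cruft pack is worth recombining."""
    if pack.size >= limit:
        return False                  # already at or over the threshold
    has_expired = any(m < expire_before for m in pack.object_mtimes)
    if pack.filled_to_capacity and not has_expired:
        return False                  # "this is fine as-is"
    return True

LIMIT = 200 * MB
settled = CruftPack(199 * MB, [1000, 2000], filled_to_capacity=True)
skip = should_repack(settled, LIMIT, expire_before=500)           # nothing expired
redo_expired = should_repack(settled, LIMIT, expire_before=1500)  # one object expired
redo_unmarked = should_repack(CruftPack(199 * MB, [1000, 2000]), LIMIT, 500)
```

The size comparison alone would select the 199MB pack every time;
the remembered marker plus the expiry check is what lets it be left
alone.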

> Instead, change the meaning of '--max-pack-size' in pack-objects when
> combined with '--cruft'. When put together, '--max-pack-size' allows the
> pack to grow larger than the specified threshold, but only by one
> additional object.

I do not think that would work well.  You have no control over the
size of that one additional object---it may weigh more than 100MB,
combining with your 199MB cruft pack to make a ~300MB cruft pack.
In other words, "just a little bit larger" sounds like wishful
thinking and handwaving.
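
The same kind of toy model shows the overshoot.  Again this is a
sketch under assumptions: fill_pack_one_over() is a made-up helper
that keeps adding items until the total reaches the limit, which is
one reading of "one object over threshold", and the sizes are
invented.

```python
MB = 1024 * 1024

def fill_pack_one_over(sizes, limit):
    """Toy model: stop adding only once the total has reached the limit.

    The item that crosses the limit is still included, so the
    overshoot is bounded only by the size of that one item.
    """
    total = 0
    for size in sizes:
        if total >= limit:
            break
        total += size
    return total

LIMIT = 200 * MB

# A 199MB pack followed by a single 101MB object.
big = fill_pack_one_over([199 * MB, 101 * MB, 1 * MB], LIMIT)
big_overshoot = big - LIMIT

# The benign case, where the crossing object happens to be small.
small = fill_pack_one_over([199 * MB, 2 * MB], LIMIT)
small_overshoot = small - LIMIT
```

Here the first pack ends up at 300MB, 100MB over the limit, while
the second is only 1MB over; which one you get depends entirely on
the object that happens to come next.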



