On Wed, Dec 16, 2020 at 02:22:52PM +0100, Daniel Klauer wrote:

> Background: bitbake downloads git repositories during a build process
> and supports caching them locally (in form of bare repos in some
> user-defined directory). This prevents having to re-download them during
> the next build, and also it is a convenient mirroring/backup system in
> case the original URLs stop working.
>
> As far as I can tell (since I'm not a bitbake developer) the git
> pack-redundant invocation is one of multiple calls meant to improve
> storage (probably minimize disk usage) of the locally cached git repos.
> For reference, please take a look at the other git commands it's
> invoking [1], and at the commit messages of the commits that added these
> invocations [2] [3] [4].
>
> If doing it that way seems wrong, I'll report the issue to bitbake
> upstream too. Maybe there is a better way to do whatever bitbake wants
> to do here?

Thanks for that context. I don't think it's _wrong_, in the sense that
what they want to do (remove redundant packs) is a reasonable thing to
want. But in practice I suspect that it rarely helps.

It only makes sense if a pack is fully made redundant by other packs.
But that is unlikely to happen after a fetch, because Git tries not to
send objects that already exist. So while there could be overlap, it's
unlikely that full packs are candidates for deletion. And if any are,
then that is probably a sign that fetch is not being given enough
information (e.g., if there are packs being copied into the repo behind
the scenes, make sure that there are matching refs pointing to their
objects, so Git knows it has that part of the object graph).

For saving space, "git repack -ad" is a much better option.
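As a rough sketch of the effect (this uses a throwaway non-bare repo in
a temporary directory purely for illustration; the file names and the
scratch location are made up, and a real bitbake cache would be a bare
repo populated by fetches rather than local commits):

```shell
#!/bin/sh
# Illustration only: create two packs, then consolidate them with "repack -a -d".
set -e

repo=$(mktemp -d)                 # hypothetical scratch location
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# First commit, then pack the loose objects into a first pack.
echo one > a.txt
git add a.txt
git commit -q -m "first"
git repack -q -d

# Second commit, packed incrementally into a second pack.
echo two > b.txt
git add b.txt
git commit -q -m "second"
git repack -q -d

before=$(ls .git/objects/pack/*.pack | wc -l | tr -d ' ')

# Rewrite everything reachable into one pack and delete the old ones.
git repack -q -a -d

after=$(ls .git/objects/pack/*.pack | wc -l | tr -d ' ')
echo "packs before: $before, after: $after"
```
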
It puts everything reachable into a single pack, which means:

  - if two packs contain duplicates of an object, we'll end up with only
    a single copy, even if those packs also contained some unique
    objects

  - by putting all objects in the same pack, we have more opportunities
    for delta compression between similar objects

  - we'll drop any unreachable objects completely (presumably this is
    desirable here, but if they're trying to keep objects that don't
    have refs pointing at them as part of some caching scheme, they
    might not want that; passing "-k" will keep the unreachable
    objects, too)

Since they're already doing other maintenance like "pack-refs", running
"git gc" may be preferable, as it would cover that, too. Use
"--prune=now" to drop the unreachable objects immediately (as opposed to
giving them a 2-week grace period). Note that git-gc has no equivalent
to repack's "-k", so if they need that, they'll have to invoke
git-repack directly.

-Peff