Hello:

I am pleased to announce version 2.0 of kernel.org's git mirroring software, grokmirror. This is a major rewrite that intentionally breaks the upgrade path from grokmirror-1.x, because significant backend changes require thoughtful consideration from replica administrators. Please see the UPGRADING.rst document provided with this release.

## New in grokmirror-2.0

- Drop support for Python < 3.6.
- Introduce "object storage" repositories that benefit from git-pack delta islands and reduce the overall disk storage footprint (results will depend directly on the number of forks).
- Drop the dependency on GitPython: use git calls directly for all operations.
- Remove progress bars to slim down dependencies (drops enlighten).
- Make grok-pull operate in daemon mode (with -o); see contrib for systemd unit files. This is more efficient than cron mode when run very frequently.
- Provide a socket listener for pubsub push updates (see contrib for Google pubsubv1.py).
- Merge fsck.conf and repos.conf into a single config file. This requires creating a new configuration file after the upgrade. See UPGRADING.rst for details.
- Record and propagate HEAD position using the manifest file.
- Add the grok-bundle command to create clone.bundle files for CDN-offloaded cloning (mostly used by Android's repo command).
- Add an SELinux policy for EL7 (see contrib).

## Object Storage Repositories

Grokmirror 2.0 introduces the concept of "object storage repositories", which aims to optimize how repository forks are stored on disk and served to cloning clients.

When grok-fsck runs, it automatically recognizes related repositories by analyzing their root commits. If it finds two or more related repositories, it sets up a unified "object storage" repo and fetches all refs from each related repository into it.

For example, you can have two forks of linux.git:

    torvalds/linux.git:
      refs/heads/master
      refs/tags/v5.0-rc3
      ...

and its fork:

    maintainer/linux.git:
      refs/heads/master
      refs/heads/devbranch
      refs/tags/v5.0-rc3
      ...

Grok-fsck will set up an object storage repository and fetch all refs from both repositories:

    objstore/[random-guid-name].git
      refs/virtual/[sha1-of-torvalds/linux.git:12]/heads/master
      refs/virtual/[sha1-of-torvalds/linux.git:12]/tags/v5.0-rc3
      ...
      refs/virtual/[sha1-of-maintainer/linux.git:12]/heads/master
      refs/virtual/[sha1-of-maintainer/linux.git:12]/heads/devbranch
      refs/virtual/[sha1-of-maintainer/linux.git:12]/tags/v5.0-rc3
      ...

Then both torvalds/linux.git and maintainer/linux.git will be configured to use objstore/[random-guid-name].git via objects/info/alternates and repacked to contain only metadata and no objects.

The alternates repository will be repacked with "delta islands" enabled, which should help optimize clone operations for each "sibling" repository.

Please see the example grokmirror.conf for more details about configuring objstore repositories.

## Space savings using object storage repositories

Any disk space savings will depend on how many repositories are forks of each other. For git.kernel.org, which already aggressively used alternates for all linux.git forks, we saw a reduction from 60GB to 20GB for the entirety of git.kernel.org content. On some of the codeaurora.org systems, especially those containing a lot of pre-release forks of entire AOSP repo collections, we saw space usage go from 3TB to under 1TB.

## Stability

This release has proven pretty stable and has been operating on git.kernel.org and a subset of codeaurora.org systems for over a month. However, since the trickiest part is the initial conversion of repositories to object storage repos, we urge you to proceed with caution. Please study the UPGRADING.rst document before making any changes to your infrastructure.

Please email all support questions to tools@xxxxxxxxxxxxxxxx.

Best regards,
Konstantin