Bundles: Partial Clone & Shallow clone

Hi!

I was wondering if anybody has made a start towards easier ways to
create bundles of partial & shallow clones.

I'm looking at potential ways to improve the various snapshot &
incremental update mechanisms that we have available in Gentoo Linux's
tree distribution.

Right now, the various mechanisms available are:
- git tree WITHOUT generated metadata (git-daemon, smart-http)

Plus all of the following WITH generated metadata.
- git tree (git-daemon OR smart-http)
- daily full snapshots in several formats/layouts.
- deltas of full snapshots in some of the formats & layouts [1-day interval, no rollback]
- rsync:// tree [30-min intervals, no rollback]

Offline/air-gapped use cases are presently expected to use the
snapshots+deltas, but the 1-day cadence is longer than desirable.

Several of these formats cannot roll back unless the user kept the old
tree or the snapshot it was generated from.

rsync:// performance is heavily impacted by network latency and the
large number of small objects being transferred.

What I'd like to offer instead is CDN-replicated bundles generated at
regular cadences, with the absolute minimal content, taken from the git
tree WITH generated metadata. The git tree would have tags for every
cadence point (ideally every 30 minutes, with potential pruning of old
tags). The bundles would have GPG-signed checksums included separately,
to provide verification of the updates.
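
As a rough illustration of the checksum side, something like the
following could build the manifest that then gets signed. This is only a
sketch: the choice of SHA-512, the "<digest>  <name>" line layout and the
function names are all assumptions for illustration, not an existing
format.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_lines(bundle_paths):
    """Build '<hex digest>  <file name>' lines, sorted by file name."""
    ordered = sorted(bundle_paths, key=lambda p: Path(p).name)
    return [f"{sha512_of(path)}  {Path(path).name}" for path in ordered]


def write_manifest(bundle_paths, manifest_path):
    """Write the manifest file; signing it is a separate step (not shown)."""
    text = "\n".join(manifest_lines(bundle_paths)) + "\n"
    Path(manifest_path).write_text(text, encoding="utf-8")
    return manifest_path
```

The resulting manifest could then be signed out of band, for example
with `gpg --armor --detach-sign`, and published next to the bundles.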

These come in two variants:
1. Daily full snapshots, equivalent to depth=1 clone. 
2. 30-min & daily incremental bundles, using partial clone (needs to
   include the new blobs that would be present when up to date, and
   knowledge of which files can be deleted).
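
For the "which files can be deleted" part of variant 2, the list can be
derived from a tree diff between two cadence tags. A minimal sketch,
assuming hypothetical tag names and a local gitdir path; `--no-renames`
is used so that paths that were renamed away still show up as deletions:

```python
import subprocess


def deleted_paths_command(gitdir, old_tag, new_tag):
    """argv listing paths that exist at old_tag but not at new_tag."""
    return [
        "git", "-C", gitdir, "diff",
        "--name-only", "--no-renames", "--diff-filter=D", "-z",
        old_tag, new_tag,
    ]


def parse_nul_list(raw):
    """Split NUL-terminated git output (bytes) into path strings."""
    return [
        item.decode("utf-8", errors="surrogateescape")
        for item in raw.split(b"\0")
        if item
    ]


def deleted_paths(gitdir, old_tag, new_tag):
    """Run the diff and return the deleted paths (requires git)."""
    result = subprocess.run(
        deleted_paths_command(gitdir, old_tag, new_tag),
        check=True,
        stdout=subprocess.PIPE,
    )
    return parse_nul_list(result.stdout)
```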

This should let users load some consistent set of daily or 30-min
bundles onto their gitdir, and hop between those tags [they would not
have gaps].

If they did have gaps due to a missing bundle, or wanted to go between
the cadence points, and were online, they could use the partial clone
mechanisms to fill in their tree as needed.

Right now, I can naively generate the snapshots by explicitly making a
new detached shallow clone & then generating a bundle of that. 
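
Roughly, the naive approach amounts to the two commands built below.
This is a sketch only: the repository URL, tag name and paths are
placeholders, and the helper names are made up for illustration. Note
that `--depth` is only honoured for a local repository when it is given
as a `file://` URL.

```python
import subprocess


def shallow_bundle_commands(repo_url, tag, workdir, bundle):
    """argv lists: depth=1 clone detached at a tag, then bundle its HEAD.

    bundle should be an absolute path, since the second command runs
    with -C workdir.
    """
    return [
        ["git", "clone", "--depth", "1", "--branch", tag, repo_url, workdir],
        ["git", "-C", workdir, "bundle", "create", bundle, "HEAD"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

For example, `run_all(shallow_bundle_commands("file:///srv/tree.git",
"snap-daily", "/tmp/snap-work", "/tmp/snap-daily.bundle"))` would
produce one snapshot bundle, at the cost of the throwaway clone.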

Incremental bundles are already possible, but presently include all
commits, trees & blobs between two points.
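
For reference, the existing incremental form is just a revision range
handed to git-bundle. A minimal sketch, with placeholder tag names and
paths and made-up helper names; the resulting bundle carries every
object reachable from the new tag but not from the old one, which is
exactly the overhead described above:

```python
import subprocess


def incremental_bundle_commands(gitdir, old_tag, new_tag, bundle):
    """argv lists: bundle old_tag..new_tag, then verify the result.

    bundle should be an absolute path, since git runs with -C gitdir.
    """
    return [
        ["git", "-C", gitdir, "bundle", "create", bundle,
         f"{old_tag}..{new_tag}"],
        ["git", "-C", gitdir, "bundle", "verify", bundle],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```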

The bundles are already generally smaller than our prior snapshots and
deltas, but I'm looking to make the process easier and cover the
remaining gaps [if we have a high-change period, then the bundle winds
up bigger than other deltas, because of the intermediate blobs].

Ask 1. Ability to generate shallow bundle without the intermediate clone step.

Ask 2. For incremental bundles, ability to exclude blobs not needed by
       the latest commit (and its tree).

I think both of these would be possible by adding some variant of the
--filter mechanism to git-bundle.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robbat2@xxxxxxxxxx
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
