Delta updates for Flatpaks - implementing in Fedora infrastructure

Introduction
============
Flatpaks were originally updated via ostree repositories. More
recently, the ability was added to distribute them as container
images, and that is what we do for the Fedora-built Flatpaks via
registry.fedoraproject.org.

The one big gap we have between the two ways of distributing Flatpaks
is delta updates. If you have downloaded a Flatpak from an ostree
repository, you can update to the next version via the raw ostree
protocol, downloading only the changed files. Even better, if
"ostree static deltas" have been properly computed and stored in the
upstream repository, you can get one big blob that uses bsdiff and
other techniques to efficiently compress the differences.
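
As a rough illustration of the "download only changed files" idea, the
sketch below compares content checksums of two versions of a tree. It
is not ostree's actual protocol or object format; the file names, the
function names and the use of a plain SHA-256 over file contents are
all assumptions made for the example.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Content address of a file (illustrative: SHA-256 of the bytes only)."""
    return hashlib.sha256(content).hexdigest()


def objects_to_fetch(local_tree: dict, remote_listing: dict) -> set:
    """Return checksums listed by the remote that are not present locally."""
    have = {checksum(data) for data in local_tree.values()}
    return {digest for digest in remote_listing.values() if digest not in have}


# Hypothetical old and new versions of an application tree.
old_version = {
    "bin/app": b"binary v1",
    "share/icon.png": b"icon",
    "share/data": b"levels",
}
new_version = {
    "bin/app": b"binary v2",
    "share/icon.png": b"icon",
    "share/data": b"levels",
}

# The remote publishes path -> checksum for the new version.
remote_listing = {path: checksum(data) for path, data in new_version.items()}

needed = objects_to_fetch(old_version, remote_listing)
print(len(needed), "of", len(remote_listing), "objects need downloading")
```

A static delta goes one step further than this: instead of fetching
each changed file whole, it ships the differences between the old and
new contents in a single blob.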

This spring, Alex Larsson came up with a way of updating containers
with deltas *in general* and implemented that in podman
(https://github.com/containers/image/pull/902, still pending) and in
Flatpak (support released in Flatpak 1.8). See:

 https://blogs.gnome.org/alexl/2020/05/13/putting-container-updates-on-a-diet/

This is inspired by ostree static deltas, but modified to fit into the
container world - the end result is both super simple and works
remarkably well.

The question then becomes: how do we generate deltas for the Flatpaks
we ship for Fedora and make them available to users?

Generating static deltas for Flatpaks
=====================================

There are currently 3 different codebases I maintain to wrangle
Flatpak metadata:

  regindexer: script that queries registry.fedoraproject.org and
writes an index of Flatpaks for use by Flatpak clients (deployed in
Fedora infrastructure currently)
  https://pagure.io/regindexer

  flatpak-status: daemon that queries bodhi and koji, figures out what
Flatpaks are out-of-date, and generates a JSON file used to create a
web user interface
  https://fedora.fishsoup.net/flatpak-status/
  https://github.com/owtaylor/flatpak-status

  flatpak-indexer: daemon that queries the Red Hat container API and
an internal Koji instance, and writes an index of Flatpaks for use by
Flatpak clients (e.g.,
https://flatpaks.redhat.io/rhel/index/static?label:org.flatpak.ref:exists=1&tag=latest)

My path forward here was: take the code from flatpak-status that
queries bodhi and koji, use it to teach flatpak-indexer how to handle
Fedora Flatpaks as well as RHEL Flatpaks, then add the capability to
generate deltas.

The result of this can be found at:

 https://github.com/owtaylor/flatpak-indexer

It seems to work fine - it generates indexes and deltas for Fedora
Flatpaks that work in limited testing. (The download for updating
berusky from the last stable version in Fedora to the current one was
reduced from 5.5MB to 18kB.)

Distributing static deltas for Flatpaks
=======================================
The eventual goal of our delta project is to upload the deltas to
container registries, as described in Alex's blog post, and we've been
in discussion with the Quay.io folks to figure out the best way to do
this. But we didn't want to block static deltas on a) having OCI
artifact support on quay.io, b) having a finalized way to do delta
updates as OCI artifacts and a white-listed MIME type, c) getting
Fedora switched to quay.io, and d) having Red Hat built containers
hosted on a container registry. So we added a second path to Flatpak:
the image index that Flatpak consumes can point to a "delta manifest"
by HTTP URL, and the delta manifest can in turn point to the
individual layer deltas, also by HTTP URL.
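
The two-step lookup (index, then delta manifest, then layer delta) can
be sketched as follows. This is only an illustration of the chain of
references: the URLs, the JSON field names (``delta_manifest_url``,
``layer_deltas``, ``from``, ``to``) and the in-memory ``FAKE_SERVER``
that stands in for HTTP requests are all hypothetical, and do not
reflect the real index or delta manifest formats.

```python
from typing import Optional

# Stand-in for HTTP GET plus JSON parsing. All URLs and field names here
# are hypothetical; they are not the real Flatpak formats.
FAKE_SERVER = {
    "https://example.invalid/index/static": {
        "images": [
            {
                "ref": "app/org.example.App/x86_64/stable",
                "layer_diff_id": "sha256:new",
                "delta_manifest_url": "https://example.invalid/deltas/app.json",
            },
        ]
    },
    "https://example.invalid/deltas/app.json": {
        "layer_deltas": [
            {
                "from": "sha256:old",
                "to": "sha256:new",
                "url": "https://example.invalid/deltas/old-new.tardiff",
            },
        ]
    },
}


def fetch_json(url: str) -> dict:
    """Pretend to download and parse a JSON document."""
    return FAKE_SERVER[url]


def find_layer_delta_url(index_url: str, ref: str,
                         installed_diff_id: str) -> Optional[str]:
    """Follow index -> delta manifest -> layer delta for one installed layer."""
    index = fetch_json(index_url)
    for image in index["images"]:
        if image["ref"] != ref:
            continue
        manifest_url = image.get("delta_manifest_url")
        if manifest_url is None:
            return None  # no deltas published for this image
        manifest = fetch_json(manifest_url)
        for delta in manifest["layer_deltas"]:
            if (delta["from"] == installed_diff_id
                    and delta["to"] == image["layer_diff_id"]):
                return delta["url"]
    return None  # caller falls back to downloading the full layer


print(find_layer_delta_url(
    "https://example.invalid/index/static",
    "app/org.example.App/x86_64/stable",
    "sha256:old",
))
```

When no matching delta is found, a client in this sketch would simply
download the full layer, so deltas remain an optional optimization.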

For now, flatpak-indexer doesn't upload the delta manifests or layer
deltas - it just writes them into the static data along with the
indexes and icons, and then points to them from the index.

Architecture of flatpak-indexer
===============================
flatpak-indexer shares a property with the currently deployed
regindexer - it generates only static content. The overall
components are:

  redis: used to cache data retrieved from "upstream" (koji and bodhi
for Fedora), and to communicate between the indexer and differ
containers

  indexer: the main container - it periodically retrieves data from
upstream sources, determines what layers need deltas, queues them up
for the differ containers, collects the results, and writes the delta
manifests and indexes

  differ: containers that execute the expensive 'tar-diff' operation to
compute the layer deltas. The number of containers can potentially be
scaled based on the number of queued layer deltas

  frontend: an apache server that serves up the generated data - with
appropriate redirects and headers
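
The division of labour between the indexer and the differ containers
can be sketched roughly as below. This is not flatpak-indexer's actual
code: an in-memory ``deque`` stands in for the redis queue, the
``tar-diff`` step is replaced by a placeholder string, and the policy
of building deltas only towards the newest layer is an assumption made
for the example.

```python
from collections import deque


def deltas_needed(versions, already_built):
    """Pick (old, new) layer pairs that still need a delta.

    `versions` lists layer IDs oldest first. Assumed policy: every older
    layer gets a delta to the newest one, unless it was already built.
    """
    if len(versions) < 2:
        return []
    newest = versions[-1]
    return [(old, newest) for old in versions[:-1]
            if (old, newest) not in already_built]


def run_differ(task_queue, result_store):
    """Differ side: drain the queue and record one result per task."""
    while task_queue:
        old, new = task_queue.popleft()
        # Placeholder for the expensive tar-diff computation.
        result_store[(old, new)] = f"{old}-{new}.delta"


# Indexer side: decide what is missing and queue it up.
task_queue = deque()   # stand-in for a redis-backed queue
results = {}           # stand-in for results reported back to the indexer
already_built = {("layer-v1", "layer-v3")}
task_queue.extend(
    deltas_needed(["layer-v1", "layer-v2", "layer-v3"], already_built)
)

run_differ(task_queue, results)
print(results)
```

Keeping the queue in a shared store is what allows the number of
differ workers to vary independently of the single indexer.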

This is set up in OpenShift internally, and would presumably be done
the same way in Fedora infrastructure, though we could potentially
leave the "frontend" role to sundries as it is currently with the
regindexer generated index and icons.

Resource consumption
====================
The 'redis' and 'indexer' containers are lightweight - intermittent
usage of 1 CPU, maybe 256MB of memory. The differ containers need to
be more beefy - they could use 1-2 CPUs and 2-3GB of memory. But they
are only needed intermittently and are otherwise idle. With some added
complexity, they could be scaled to zero, and only scaled up when
there are tar-diffs to process.
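
One possible shape for such a scale-to-zero policy is sketched below.
The function name, the ``deltas_per_worker`` ratio and the
``max_replicas`` cap are hypothetical tuning knobs chosen for
illustration, not settings that exist in flatpak-indexer or OpenShift.

```python
import math


def desired_differ_replicas(queued_deltas: int,
                            deltas_per_worker: int = 10,
                            max_replicas: int = 4) -> int:
    """Map queue length to a differ replica count (hypothetical policy)."""
    if queued_deltas <= 0:
        return 0  # nothing to do: scale to zero
    wanted = math.ceil(queued_deltas / deltas_per_worker)
    return min(max_replicas, wanted)


for queued in (0, 1, 25, 1000):
    print(queued, "queued ->", desired_differ_replicas(queued), "differ(s)")
```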

The disk usage for index+icons+deltas for the current Fedora Flatpak
set is 347MB. This will increase in proportion to the number of
Flatpaks in Fedora, but I wouldn't expect it to ever be *much* bigger,
since deltas pointing to old versions of images will be cleaned up.
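
The cleanup rule could look something like the sketch below, which
treats a delta as stale once the image version it updates *to* is no
longer current. The file names, the layer identifiers and the exact
rule are assumptions for illustration; the real cleanup logic may use
different criteria.

```python
def stale_deltas(delta_files: dict, live_targets: set) -> list:
    """Return delta file names whose target layer is no longer current.

    `delta_files` maps a file name to a (from_layer, to_layer) pair.
    """
    return sorted(
        name for name, (_source, target) in delta_files.items()
        if target not in live_targets
    )


# Hypothetical on-disk deltas and the set of currently published layers.
delta_files = {
    "v1-v2.delta": ("layer-v1", "layer-v2"),
    "v1-v3.delta": ("layer-v1", "layer-v3"),
    "v2-v3.delta": ("layer-v2", "layer-v3"),
}
live_targets = {"layer-v3"}

print(stale_deltas(delta_files, live_targets))
```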

Future work
===========
The current status is good enough to deploy as a delta solution, but
there are some further possibilities:

* Teach flatpak-indexer how to upload deltas to a container registry
as OCI artifacts
* Use flatpak-indexer to generate indexes for Fedora containers that
are not Flatpaks, useful for
* Make flatpak-indexer the backend to the Flatpak status web page,
and move that inside Fedora infrastructure