Re: Proposal to mirror Docker images

On Fri, 2016-08-19 at 12:19 +0000, Patrick Uiterwijk wrote:
> On Tue, Aug 16, 2016 at 3:33 PM, Randy Barlow
> > docker
> > ------
> > 
> > The most significant work required will likely be modifying the
> > docker
> > client to enable it to properly handle the metalink responses it
> > will be
> > receiving from Mirror List. When requesting the manifest, it will
> > receive a metalink document that will give it a priority ordered
> > list of
> > mirrors. It will need to work through the list in order until it
> > reaches
> > a mirror that has the correct checksum for the requested manifest.
> > It
> > will then use that same mirror for the subsequent blob requests.
> 
> So, we get a mirrormanager module (previously called repo) for each
> different
> container we ship? Since currently the metalink level is on repo-
> arch, and that
> is the level on which it retains and sends the checksums as part of
> the metalink.
> If this is the case, I think we have quite a bit more work ahead for
> mirrormanager
> to make it able to work with lots of modules.

Hello Patrick, thanks for your reply!

I don't think anyone has decided on this particular detail yet, though
I'm glad you raised it because we should probably figure it out soon. I
personally wouldn't want to offer a module for each container, as that
could quickly get out of hand UI-wise. If Fedora ends up with a
container per application, this could easily be hundreds or thousands
of modules.

However, that still leaves the question of what *should* a module be?
Should all of Docker just be one module, so mirrors can choose all-or-
nothing? Or, if Docker containers end up getting based on Fedora
releases (like F23/24), could an example module be "F23-docker-images"? 
Or perhaps we could add in arch like "F24-x86_64-docker-images"?

It has also been suggested that the Docker images could be like Atomic,
where there is a release every two weeks and nothing is supported
longer than that. If we went that route, the only options I can think
of are all-or-nothing docker or arch-specific modules.

Does anybody have any thoughts or other ideas on where would be good to
draw the lines here?

> Also, will we be signing the images, or is the using of metalink the
> only security
> people get? Since I can guarantee you that some people will feel like
> their closest mirror
> is from their competitor and not want to use it, or that they hit
> mirrorlist at one point
> when it's down, and they will stop using the metalink, and just
> insert
> $randommirror
> directly, and use that.
> I would want to make sure that even if they do this, they still get
> some sort of
> verification of the data. This was previously mentioned as one of the
> main blockers
> for getting this stuff mirrored out.

You raise an important concern that I also share here. Docker Manifests
have a built-in signature feature. To be honest, I have not learned the
details about the signature feature, and worse, my personal Manifest
familiarity is limited to their v2 schema 1 format[0] which has been
obsoleted by a schema 2 format that I have not yet familiarized myself
with. So let's say that <hand_waving>yes, we can sign the Docker
Manifests</hand_waving> somehow.

The Blobs (Docker calls the Image layers Blobs) themselves are not
signed, but they are referenced by checksum in the Manifest. You can
see an example Manifest at [0]. Thus, theoretically, if the user trusts
the Manifest because it is signed by Fedora, they should be able to
trust the Blob layers that they download so long as they do match the
expected checksum. The Docker client does seem to check the checksum in
my experience.
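
As a rough illustration of the Blob check described above (a sketch only,
not code taken from the Docker client), verifying a downloaded Blob
against the digest listed in a trusted Manifest could look like the
following. The function name is hypothetical, and the "sha256:<hex>"
digest form is assumed from the blobSum entries in the example Manifest
at [0].

```python
import hashlib


def blob_matches_digest(blob: bytes, digest: str) -> bool:
    """Return True if `blob` hashes to a digest of the form 'sha256:<hex>'.

    Hypothetical helper: only sha256 is handled in this sketch, and any
    other algorithm prefix is treated as a failed check.
    """
    algorithm, _, expected_hex = digest.partition(":")
    if algorithm != "sha256" or not expected_hex:
        return False
    return hashlib.sha256(blob).hexdigest() == expected_hex.lower()


# Usage: the digest would come from the signed Manifest, the bytes from
# whichever mirror served the Blob.
layer = b"example layer bytes"
listed_digest = "sha256:" + hashlib.sha256(layer).hexdigest()
print(blob_matches_digest(layer, listed_digest))
print(blob_matches_digest(b"tampered bytes", listed_digest))
```

The point is that trust flows from the Manifest: a mirror cannot swap
in different Blob contents without the digest comparison failing.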

By the way, the metalink response from Mirror List will include the
expected checksum for the manifest that the client is trying to pull.
Thus, in addition to Fedora signing the Manifest itself, we can also
have the client validate the checksum of the Manifest they receive from
the mirror. If we ensure that clients always communicate with Mirror
List over TLS, this will add another layer of validation for us.
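
To make the client-side behavior concrete, here is a minimal sketch of
walking the priority-ordered mirror list and accepting the first mirror
whose Manifest matches the checksum from the metalink. Everything named
here (pick_mirror, the fetch_manifest callable, the example mirror URLs)
is an assumption for illustration. Parsing the metalink document and the
real HTTP requests are left out, and sha256 is assumed as the checksum
type.

```python
import hashlib
from typing import Callable, Iterable, Optional, Tuple


def pick_mirror(
    mirrors: Iterable[str],
    expected_sha256: str,
    fetch_manifest: Callable[[str], bytes],
) -> Optional[Tuple[str, bytes]]:
    """Return (mirror, manifest) for the first mirror with a matching manifest.

    `mirrors` is assumed to already be in metalink priority order.
    `fetch_manifest` is a hypothetical callable that downloads the manifest
    bytes from one mirror and raises OSError when the mirror is unreachable.
    """
    for mirror in mirrors:
        try:
            manifest = fetch_manifest(mirror)
        except OSError:
            continue  # mirror is down; try the next one in priority order
        if hashlib.sha256(manifest).hexdigest() == expected_sha256:
            return mirror, manifest  # reuse this mirror for the blob requests
    return None  # no mirror had the expected manifest


# Usage with an in-memory stand-in for the network:
good = b'{"schemaVersion": 1}'
fake_mirrors = {
    "https://mirror-a.example": b"stale manifest",
    "https://mirror-b.example": good,
}
result = pick_mirror(
    ["https://mirror-a.example", "https://mirror-b.example"],
    hashlib.sha256(good).hexdigest(),
    lambda url: fake_mirrors[url],
)
print(result[0])
```

Passing the fetch function in as a parameter keeps the selection logic
separate from the transport, so a stale or unreachable mirror is simply
skipped.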

Due to a technical detail about how the Docker client works, users will
not be able to docker pull from a mirror of their choosing with the
design I am proposing here. When the docker client attempts to pull,
the first thing it does is perform a GET /v2/. The registry *must*
respond with a particular header that indicates that it is a Docker v2
registry, and include {} as the response body. Due to the way our
mirrors currently operate, I do not believe we will be able to have the
/v2/ path without negotiating for that with our mirrors. I suspect that
many mirror admins would not like to give us that path. This is why the
plan is to have all Fedora docker clients perform docker pull against
Mirror List, so that Mirror List can send the required /v2/ response
and then send them the metalink when the user requests the manifest.
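
For illustration only, the /v2/ answer that Mirror List would need to
give could be sketched as a tiny WSGI application like the one below.
The header name and value (Docker-Distribution-API-Version:
registry/2.0) are taken from my reading of the Docker Registry HTTP API
V2 specification rather than from this thread, and the application name
is hypothetical. The real Mirror List code would look different.

```python
def v2_base_app(environ, start_response):
    """Minimal WSGI app that answers the version check on GET /v2/.

    Sketch under stated assumptions: no authentication, and no manifest
    or metalink handling.
    """
    if environ.get("REQUEST_METHOD") == "GET" and environ.get("PATH_INFO") == "/v2/":
        body = b"{}"
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
            # Header that identifies the endpoint as a Docker v2 registry.
            ("Docker-Distribution-API-Version", "registry/2.0"),
        ])
        return [body]
    body = b"not found"
    start_response("404 Not Found", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

It could be served locally for experimentation with
wsgiref.simple_server.make_server("", 8000, v2_base_app). A plain rsync
mirror has no equivalent place to serve this response from, which is
the constraint described above.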

There are two technical solutions that can resolve the problem I
described in my last paragraph, but they will both require mirror
admins to agree to things they may not want to agree to. One is that we
could negotiate with mirror admins to allow us to store "docker stuff"
at /v2/ on their mirrors. Another possibility is for mirror admins to
each run a Docker registry. I think each of these will present a bit of
a negotiating challenge for us, which is why I proposed the particular
plan I wrote up instead of either of these. However, if anyone wants to
explore these options further please feel free.


> > 
> > 
> > There is some concern that such a feature would not be accepted by
> > the
> > upstream docker project. If we were to proceed with this proposal,
> > we
> > would propose this patch to the upstream Docker project. If
> > upstream
> > were not willing to accept the feature, we would need to have the
> > Fedora
> > docker packager carry this patch as a downstream add on.
> 
> So that would mean that the Fedora docker images are only available
> for
> people that try to run it on a Fedora host or that we manually tell
> "Yeah, use this mirror url that's outside of our control"? I'm not
> sure that this
> will go over smoothly with other people..

Unfortunately I believe you are correct, except that users also won't
be able to point their docker client at our mirrors. The only way we
could enable the unpatched docker client to pull Fedora content would
be if we allowed users to access the Docker Registry that the OSBS
builds are pushed into (which is where we will be getting the content
from as well, to distribute to the mirrors). The unfortunate thing
about this proposal is that other than the OSBS registry, there is no
other single system that can handle the entirety of the requests coming
from a docker pull command.

> Do you have any idea how likely it is to get this patch accepted into
> upstream?
> From what I've heard, the Docker people are not really happy about
> merging
> distro-specific things, so this is a considerable risk, unless we
> will
> just accept
> that users that are not running a fedora host can't run our images...

It is difficult for me to answer this question, but Docker does have a
reputation of refusing contributions that help enable competing
registries. I would say that there is a strong chance that the patch I
am proposing would not be accepted upstream.

If it is important to us that non-Fedora users can docker pull our
images, I can think of two options:

0) Negotiate with mirror-admins to allow us to either have /v2/ on the
   mirror, or to run a full registry instead of just rsyncing Docker
   content from us.

1) Allow non-Fedora users to docker pull from the same registry that
   OSBS is putting its content into. This may be difficult to scale
   without a CDN.

Given these new "holes" that are revealed in my proposal, what does
everyone think? Patrick raised some valid concerns here. I'd like to
make sure we're going in a good direction before going forward with these
plans. Please speak up if you find any of the above to be "show
stoppers", or if you can think of alternative solutions that I've
overlooked.

Thanks everyone for the conversation so far!


[0] https://docs.docker.com/registry/spec/manifest-v2-1/


_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://lists.fedoraproject.org/admin/lists/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
