On Tue, Aug 16, 2016 at 12:46 PM, Randy Barlow
<bowlofeggs@xxxxxxxxxxxxxxxxx> wrote:
> On Tue, 2016-08-16 at 11:24 -0500, Jason L Tibbitts III wrote:
>> It would also help to have the following information. The mirrors will
>> need to have this information in order to make informed decisions. (I
>> will also have to make changes to quick-fedora-mirror to accommodate.)
>>
>> 1) How much content will the mirrors need to store? How will this
>> amount change over time?
>
> Hello Jason! I confess that I don't have good answers to your
> questions, and I'm not sure who would. Many of these questions depend
> on how popular Docker images become with Fedora packagers.
>
> How many bytes of content we will be creating does depend on how many
> applications get packaged as Docker images. I would guess the base
> image to be a few hundred megabytes, but we can probably use some fancy
> hardlinking to help reduce disk/network usage so that the base image is
> only stored once. The rest of the storage is going to be the diffs
> applied as layers on top of the base image that add whatever each
> individual image needs. The sizes of these layers will vary greatly by
> application, so this is also difficult to guess.
>
> It's difficult to make informed guesses about this since I don't know
> how many Docker images the fedora packagers will create (or at what
> rate they will create them over time).
>
>> 2) Do you have a plan for placing an upper bound on the total amount of
>> data? (In Fedora things are moved to archive, though that has its
>> own problems and of course doesn't really place an upper bound on
>> anything.)
>
> I don't have such a plan at this time. If anyone has suggestions about
> this, that would be helpful. It's unclear whether Docker images would
> live inside or outside of the traditional Fedora cycle (i.e.,
> F24/F25/F26). It may have its own separate cycle, or we may just go
> with the current Fedora cycle.
>

I think we can choose a reasonable archival time. Sorting out the
implementation of that with the tool that does the layer data extraction
might be a challenge, but I imagine it's one we can collectively sort out
if necessary.

>> 3) How much change do you expect per day? Churn is really important,
>> and even now we can come close to the point where the master mirrors
>> simply can't feed new content to the tier 1 mirrors fast enough for
>> them to keep ahead of the changes we're making.
>
> This again depends on how popular the Docker image offering becomes
> with our packagers, so it is difficult for me to make an educated
> guess. Popularity is difficult to predict.

The current plan is that we will release Docker Layered Images on a
Two-Week cadence, potentially in line with the Atomic Host Two-Week
deliverable. This might change in the future, but that is the current
plan.

>
>> 4) How will this be organized on the master mirrors? It really should
>> be in a separate rsync module, and the archive (if that happens)
>> should also be in a separate rsync module.
>
> In my proposal e-mail I mentioned that it was important for mirror
> manager to allow mirror admins to opt-in to hosting Docker content.
> Since we don't know the answer to so many of these questions, I suggest
> we opt mirrors out by default, and let admins opt themselves in as they
> please. Our proposal didn't have an exact path for storing Docker
> images, but it was planned to be separated from the RPM and ISO content
> at a fairly high level in the tree.

+1

-AdamM

>
> I apologize for having so few answers. If anyone can shed more light on
> Jason's questions, please reply.
_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://lists.fedoraproject.org/admin/lists/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx