Re: RFC: Old packages remain on the mirrors for one week

Richard Hughes (hughsient@xxxxxxxxx) said: 
> On 12 August 2013 19:36, Bill Nottingham <notting@xxxxxxxxxx> wrote:
> > I could see doing this, but it is a non-trivial change to how the
> > repositories are made, and there aren't really any resources assigned to
> > work on that. I can give some pointers to where and what would need changed,
> > if you're interested in looking at it.
> 
> Yes please, thanks.


Ok.

= HOW PACKAGES GET INTO REPOSITORIES =

1. Packages are tagged into a koji tag. For rawhide and some branched trees,
this is done directly after build (i.e., tagged into f20). For updates, they
are tagged into <release>-updates or <release>-updates-testing by bodhi as
part of the push process.

2. 'mash' is called to create a repository. You can find the mash code in
the mash package, or at:
	https://git.fedorahosted.org/cgit/mash
This is done either by bodhi (as part of an updates push), or by a
rawhide/branched compose script.

3. The created repository is then rsynced to the mirror master.

4. The mirrors then sync it from there.

= HOW MASH CREATES THE REPOSITORY =

1. Calls into koji to get a list of all packages built for a tag (such as
'f20' or 'f19-updates')

2. Sorts them by architecture, noarch, whether they're signed, etc.

3. Downloads them into a directory. Runs createrepo (via python API).

4. Solves the repositories for multilib. Removes packages not wanted as
multilib. Runs 'createrepo --update' (via python API).

...

From the above, you'll note that the repository, whether it's rawhide,
updates, or something else, is created fresh every time, so any package that
is no longer the latest in its tag simply drops out. As it stands, that's
incompatible with your proposal.

Potential solutions, off the top of my head:

1) Change mash to optionally keep more than one version of a package.

When mash talks to koji to get the packages in a tag (via
session.listTaggedRPMS), it passes a boolean flag to grab either the latest
package (the default) or all packages. On the mash side, this could be
allowed to be a number, whereby mash would retrieve info on *all* tagged RPMs
from koji, and only keep/download the last N.

Pros: doesn't require changing any other tools

Drawbacks: makes the mash process slower, since you're sorting tens of
thousands of builds to limit it to the last few versions. Would require a bit
of extra code to keep the same concept of 'latest' that koji uses (last
tagged in, *not* newest NVR).

Where to look: mash/__init__.py:doCompose(), ~line 290-300.
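
As a rough illustration of the "last N" logic, here's a minimal sketch in
plain Python. It doesn't talk to koji at all: the keep_last_n helper and the
dict keys ('package_name', 'nvr', 'tag_order') are made up for the example,
and the data listTaggedRPMS really returns will look different. The only
point is that ordering goes by when a build was tagged, not by comparing NVRs.

```python
from collections import defaultdict


def keep_last_n(tagged_builds, n):
    """Return the n most recently tagged builds for each package.

    tagged_builds: iterable of dicts with the hypothetical keys
    'package_name', 'nvr' and 'tag_order' (higher means tagged later).
    """
    by_package = defaultdict(list)
    for build in tagged_builds:
        by_package[build["package_name"]].append(build)

    kept = []
    for name in sorted(by_package):
        # Order by tagging time, not by NVR: foo-1.5 below was tagged
        # after foo-2.0, so it counts as the latest.
        builds = sorted(by_package[name],
                        key=lambda b: b["tag_order"], reverse=True)
        kept.extend(builds[:n])
    return kept


builds = [
    {"package_name": "foo", "nvr": "foo-1.0-1.fc20", "tag_order": 1},
    {"package_name": "foo", "nvr": "foo-2.0-1.fc20", "tag_order": 2},
    {"package_name": "foo", "nvr": "foo-1.5-1.fc20", "tag_order": 3},
    {"package_name": "bar", "nvr": "bar-0.1-1.fc20", "tag_order": 4},
]
latest_two = [b["nvr"] for b in keep_last_n(builds, 2)]
```

With n=2, foo-1.0 is dropped and the other three builds are kept.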

2) Don't just rsync the mashed repository over, but change that step to merge
in the old packages first.

Drawbacks: kind of ugly to do. Involves copying lots of data around.

Where to look: Would require changing the buildrawhide/buildbranched scripts
from:
	https://git.fedorahosted.org/cgit/releng/tree/
to do the merge before rsyncing into place, when considering
rawhide/branched trees. For updates, likely involves changing code in bodhi
itself.
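
To give an idea of what the merge step might look like, here's a small sketch.
It is not taken from the releng scripts or bodhi: merge_old_packages is a
hypothetical helper, it assumes a flat directory of RPMs (the real trees are
split up by arch), and it leaves the repodata alone.

```python
import os
import shutil
import tempfile


def merge_old_packages(old_dir, new_dir):
    """Copy RPMs that exist in old_dir but not in new_dir into new_dir.

    Returns the sorted list of file names that were merged in.
    """
    merged = []
    for name in sorted(os.listdir(old_dir)):
        if not name.endswith(".rpm"):
            continue
        target = os.path.join(new_dir, name)
        if not os.path.exists(target):
            # This is the "copying lots of data around" part.
            shutil.copy2(os.path.join(old_dir, name), target)
            merged.append(name)
    return merged


# Tiny demonstration with throwaway directories and empty files.
root = tempfile.mkdtemp()
old_dir = os.path.join(root, "old")
new_dir = os.path.join(root, "new")
os.mkdir(old_dir)
os.mkdir(new_dir)
for name in ("bar-0.1-1.fc20.noarch.rpm", "foo-1.0-1.fc20.noarch.rpm"):
    open(os.path.join(old_dir, name), "w").close()
for name in ("bar-0.1-1.fc20.noarch.rpm", "foo-2.0-1.fc20.noarch.rpm"):
    open(os.path.join(new_dir, name), "w").close()

merged = merge_old_packages(old_dir, new_dir)
new_contents = sorted(os.listdir(new_dir))
```

After the merge, the new tree holds both foo builds, and only then would it be
rsynced into place.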

3) When rsyncing repositories over, run the rsync without delete, and have a
different script do cleanup later.

Pros: simpler than #2

Drawbacks: requires writing a new scheduled task to clean old cruft out of
the repo

Where to look: Changing the same places as #2 - buildrawhide/buildbranched,
and bodhi.
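
For the cleanup side of this, a scheduled task could look roughly like the
sketch below. Everything in it is an assumption for illustration: the
clean_stale helper, the flat directory layout, and the idea of tracking "first
seen stale" times in a dict that gets saved between runs. File mtimes aren't
used, since they may reflect when the package was built rather than when it
dropped out of the repo.

```python
import os
import tempfile
import time

WEEK = 7 * 24 * 3600  # seconds


def clean_stale(repo_dir, current_names, stale_since, now=None, max_age=WEEK):
    """Delete RPMs that have been missing from the mash for over max_age.

    current_names: set of RPM file names in the latest mash output.
    stale_since: dict of file name -> time it was first seen stale; it is
                 updated in place and meant to be persisted between runs.
    Returns the sorted list of file names removed.
    """
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(repo_dir)):
        if not name.endswith(".rpm"):
            continue
        if name in current_names:
            stale_since.pop(name, None)  # still wanted; forget any record
            continue
        first_seen = stale_since.setdefault(name, now)
        if now - first_seen > max_age:
            os.remove(os.path.join(repo_dir, name))
            del stale_since[name]
            removed.append(name)
    return removed


# Tiny demonstration with a throwaway directory and fake clock values.
repo = tempfile.mkdtemp()
for name in ("foo-1.0-1.fc20.noarch.rpm", "foo-2.0-1.fc20.noarch.rpm"):
    open(os.path.join(repo, name), "w").close()
current = {"foo-2.0-1.fc20.noarch.rpm"}
state = {}

first_run = clean_stale(repo, current, state, now=0)
second_run = clean_stale(repo, current, state, now=WEEK + 1)
remaining = sorted(os.listdir(repo))
```

The first run only records that foo-1.0 has gone stale; the run a week later
removes it.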

There are almost certainly more radical ideas out there as well for how to
do this. Feel free to holler if you have more questions.

Bill