Re: sorting of git N-V-R tags in rpm package repositories

On Tue, 5 May 2020 at 18:46, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>
> > Hey Florian,
> >
> > On Mon, 4 May 2020 at 10:03, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
> >>
> >> > as part of https://hackmd.io/kIje9yXTRdWITwP7cFK2pA (annotated tags
> >> > pushed by package maintainers) effort, I revisited the sorting
> >> > algorithm that is used to determine the "latest" tag for a given
> >> > package which is needed to determine correct package version.
> >> > Basically, if the current commit is tagged, then the N-V-R information
> >> > from that tag name is directly used to render version or release
> >> > (depending on macro usage). If the latest tag is on some older commit,
> >> > we still use information from it but the version (or release) string
> >> > will contain some appendices like .git.4.abcdef12 to mark a commit
> >> > offset from that latest tag. Note that only tags accessible from the
> >> > current branch tip when traversing git history backward are considered
> >> > to pick the latest one (i.e. tags on other separate branches are not
> >> > considered).
> >>
> >> Is this really necessary?  Koji goes by latest tagged build, it does not
> >> sort by NVR when constructing the buildroot.  The same thing seems to
> >> apply to composes (but I am less sure about that).
> >
>
> > Is that certain? I tried to poke around the koji/kojira code, the koji
> > IRC channel, and the logs at https://koji.fedoraproject.org/koji/tasks
> > (which are very slow for owner:kojira, state:all), but so far I
> > couldn't really determine whether only the latest build of a package
> > gets into the "build" repository (which is subsequently used for
> > builds) for a given tag.
>
> The latest tag for a source package name wins for the Koji-generated
> repository.  I don't know what happens if different source packages
> build subpackages of the same name.
>
> > If you build a package with an older version than what was there for
> > previous build, does that older version override the newer one? I
> > would need to poke more to find out. But it would be very interesting
> > to know.
>
> Yes, if it gets tagged into the buildroot, it replaces a build with a
> higher NVR.

Good to know. But hard to say now how much inspiration can be really
drawn from this.

>
> >> Tags can also be added retroactively and backdated.  These things
> >> conflict with the advantages you list (in particular, with NVR
> >> auto-generation, git is not the sole source of truth).
> >
> > If the tag ordering function is done properly, I believe even
> > retroactive tagging (i.e. tagging into past) and/or tag backdating
> > would be supported and NVR auto-generation would work correctly. So I
> > don't think it needs to conflict. But can you perhaps expand more on
> > "Tags can also be added retroactively and backdated", please?
> > I.e. why/when would one do that.
>
> No, you can push tags with incorrect dates.  This can change the
> auto-generation.

It really depends on what the tag ordering function is. If the
function does not consider dates at all, this wouldn't be a problem.

>
> You can only avoid this if you use data from commits (both current and
> earlier) *on the same branch* exclusively for generating metadata (or
> hash-linked from there).  Everything else can get out of sync and change
> over time even if the commit hash stays the same, so the repository
> state at a specific commit hash is no longer the sole source of truth.
> (Because you need to reconstruct that other state *at the right time*.)

What do you mean by "everything else can get out of sync and change"?
If you are talking about tags (or refs in general), it's true that you
can add tags into the past, which may or may not affect
auto-generation depending on the ordering function. You could
potentially also remove some tags if you haven't distributed them yet.
But you should never push a tag with the same name as one you have
pushed before (https://git-scm.com/docs/git-tag#_on_re_tagging), which
suggests that the tag removal mentioned above should perhaps be
forbidden as well.

So given this:
1) If we order tags by their creation time, then a new tag created on
a past commit is considered the latest for the tagged commit itself
and for all commits "above it" (i.e. its descendants, later in git
history), even if those commits carry other tags with older creation
dates. Those tags won't be used for auto-generation; the new intruder
will be used instead. If you now build from some later commit (e.g. by
specifying an older tag on that commit, which leads to its checkout in
the build system, or by giving the commit ID directly), the package
N-V-R you get will look something like foo-1.0-3.git.8.abcdef12:
foo-1.0-3 is the most recently created tag, and the .git. suffix
appears because the commit being built is 8 commits past that tag and
has a hash starting with abcdef12.
2) If we order tags primarily by the creation date of the tagged
commit and secondarily by tag creation date, then tags into the past
play no role provided that some other tag is placed on a later commit.
Again, I am assuming a detached-HEAD state where we have just checked
out the commit ID or tag name that should now be built. If there is
such a "later tagged commit" between our current detached HEAD and the
new tag created into the past, auto-generation won't be affected. If,
however, there is no such later tagged commit, we will get a different
N-V-R than we got before when building from that same commit. The new
N-V-R will contain the newly created tag, with a .git.<n>.<hash>
suffix appended if the checked-out commit is later in history than the
commit carrying the past tag.
3) If we order tags not by time but by their placement in the git
history topology, the problem is that this gives us no sorting
criterion for tags that are at the same distance from the current
HEAD. So we need another criterion to break such ties, e.g. tag
creation date again, or an alphabetical sort. If we do this, we get
basically the same result as for 2): tags into the past only affect
anything if there is no later tag between HEAD and the newly created
past tag. The difference is that "later tag" is now determined
primarily by topology instead of commit date. The two are equivalent
provided that people have the correct time set on their machines.
4) If we order tags by rpm sort on the N-V-R tag names, tags into the
past play a role if they suddenly become the latest ones with respect
to the current HEAD being built. Here "latest" means latest according
to rpm sort, and the tags need to be reachable from HEAD to be
included in the sort in the first place.
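
To make the difference between 1), 2) and 3) more concrete, here is a
minimal Python sketch. It is only an illustration: the Tag record, its
fields and all function names are made up for this example and do not
come from any existing tool, and the tag data is assumed to have been
collected already for tags reachable from HEAD. Option 4) is left out
because it needs rpm's version comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical record for one tag reachable from HEAD."""
    name: str         # N-V-R tag name, e.g. "foo-1.0-3"
    tag_date: int     # creation time of the tag (Unix seconds)
    commit_date: int  # creation time of the tagged commit
    distance: int     # commits between the tagged commit and HEAD


def latest_by_tag_date(tags):
    # Option 1: the most recently created tag wins.
    return max(tags, key=lambda t: t.tag_date)


def latest_by_commit_date(tags):
    # Option 2: tagged-commit date first, tag date as tie-breaker.
    return max(tags, key=lambda t: (t.commit_date, t.tag_date))


def latest_by_topology(tags):
    # Option 3: closest to HEAD first, newer tag date as tie-breaker.
    return min(tags, key=lambda t: (t.distance, -t.tag_date))


def render_nvr(tag, head_hash):
    # On the tagged commit the tag name is used as is; otherwise the
    # commit offset and the abbreviated hash of HEAD are appended.
    if tag.distance == 0:
        return tag.name
    return f"{tag.name}.git.{tag.distance}.{head_hash[:8]}"


head_hash = "abcdef1234567890"
tags = [
    # Regular tag, created right after its commit, 2 commits below HEAD.
    Tag("foo-1.0-4", tag_date=310, commit_date=300, distance=2),
    # Tag created "into the past": new tag date, old commit.
    Tag("foo-1.0-3", tag_date=900, commit_date=100, distance=8),
]

print(render_nvr(latest_by_tag_date(tags), head_hash))
# foo-1.0-3.git.8.abcdef12
print(render_nvr(latest_by_commit_date(tags), head_hash))
# foo-1.0-4.git.2.abcdef12
print(render_nvr(latest_by_topology(tags), head_hash))
# foo-1.0-4.git.2.abcdef12
```

With this made-up data only ordering 1) lets the tag created into the
past take over; 2) and 3) keep using the tag on the later commit.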

I am not sure whether I am over-complicating things here, i.e. how
much of a real use case tagging into the past is and when it would be
useful.

>
> That's why I proposed to auto-generate release numbers and changelogs
> based on the commits going back to the last Release: line and %changelog
> section update in the spec file.  That would be stable (unless the tool
> changes how it generates those spec file parts).

I don't think you can automatically generate %changelog and Release
and at the same time base their auto-generation on their last change
in the spec file in git history. Somehow that doesn't seem like it
would work.

There were suggestions to do the auto-generation based on the last
Version change. That would have the property that when you build from
the same commit again and again, you will get the same N-V-R and
changelog no matter what happens later in time.

That's a nice property, but I am not sure it is something we
necessarily need. With tags, it may happen that you get a different
N-V-R or changelog if someone created a new tag into the past, but it
won't happen that you get the same N-V-R that you got at some other
point in time. I.e. N-V-Rs are unique even though they may change
over time for the same commit.

Tagging into the past is IMHO a rather theoretical use case, but one
worth considering.

>
> Thanks,
> Florian
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx



