sorting of git N-V-R tags in rpm package repositories

Hello,

as part of the https://hackmd.io/kIje9yXTRdWITwP7cFK2pA effort
(annotated tags pushed by package maintainers), I revisited the sorting
algorithm used to determine the "latest" tag for a given package, which
is needed to determine the correct package version.
Basically, if the current commit is tagged, then the N-V-R information
from that tag name is used directly to render the version or release
(depending on macro usage). If the latest tag is on some older commit,
we still use the information from it, but the version (or release)
string will contain a suffix like .git.4.abcdef12 to mark the commit
offset from that latest tag. Note that only tags reachable from the
current branch tip when traversing git history backward are considered
when picking the latest one (i.e. tags on other, separate branches are
not considered).
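
To make the rendering rule concrete, here is a minimal Python sketch. The
function name and its parameters are hypothetical, and only the
.git.<offset>.<hash> suffix shape is taken from the example above; the
real macros may differ in details.

```python
def render_version(tag_version, commit_offset, short_sha):
    """Render a version (or release) string relative to the latest tag.

    Hypothetical helper: commit_offset is the number of commits between
    the latest reachable tag and the current commit.
    """
    if commit_offset == 0:
        # The current commit itself is tagged: use the tag value as-is.
        return tag_version
    # Otherwise mark the distance from the tag and the current commit hash.
    return f"{tag_version}.git.{commit_offset}.{short_sha}"


print(render_version("1.2", 0, "abcdef12"))
print(render_version("1.2", 4, "abcdef12"))
```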

Originally, I sorted tags to find the "latest one" by using git
directly with -taggerdate as the criterion, which means the most
recently created tag goes first.

I realized this will mess up the ordering if a packager creates a tag
in the past (i.e. on some older commit). That tag, while belonging to
an older commit and probably having an older NVR, would be selected as
the latest tag.

I am not sure how much of a practical problem this is, but I don't
think people would expect it, so I started to look for a better sorting
algorithm.

One idea was to employ a topological sort, i.e. sort tags in the order
they are encountered when traversing Git history backward from the
current branch tip. This would solve the above problem with tagging
past commits, but when there are multiple tags on the same commit,
another criterion, probably a time-based one, would be needed to
determine the order between them. In addition, a pure topological sort
of tags is kind of clumsy to do from the Git command line.
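
For illustration, the topological idea can be sketched in Python on an
in-memory commit graph. Everything here is an assumption made for the
example: the parents and tags_by_commit dictionaries stand in for data
that would have to be extracted from the repository, and the function
name is hypothetical.

```python
from collections import deque


def tags_in_topological_order(tip, parents, tags_by_commit):
    """List tags reachable from tip, children before parents."""
    # Collect every commit reachable from the branch tip.
    reachable = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in reachable:
            continue
        reachable.add(commit)
        stack.extend(parents.get(commit, []))

    # Count, for each commit, how many reachable children it has.
    pending_children = {commit: 0 for commit in reachable}
    for commit in reachable:
        for parent in parents.get(commit, []):
            pending_children[parent] += 1

    # Kahn's algorithm: emit a commit only after all of its children.
    ordered = []
    queue = deque([tip])
    while queue:
        commit = queue.popleft()
        # Tags on the same commit keep their input order; a secondary
        # criterion would be needed to order them meaningfully.
        ordered.extend(tags_by_commit.get(commit, []))
        for parent in parents.get(commit, []):
            pending_children[parent] -= 1
            if pending_children[parent] == 0:
                queue.append(parent)
    return ordered


parents = {"c3": ["c2"], "c2": ["c1"], "c1": [], "x1": ["c1"]}
tags = {"c1": ["pkg-1.0-1"], "c2": ["pkg-1.1-1", "pkg-1.1-2"], "x1": ["pkg-9.0-1"]}
print(tags_in_topological_order("c3", parents, tags))
```

The tag on x1 is ignored because that commit is not reachable from the
tip c3, and the two tags on c2 come out in whatever order they were
given, which is exactly the tie that needs a second criterion.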

So I decided to use git for-each-ref --sort='-taggerdate'
--sort='-*committerdate', i.e. use the date of the tagged commit as the
primary criterion and the creation date of the tag as the secondary
criterion (with multiple --sort options, git treats the last key as the
primary one). This emulates a topological sort, provided that people
have the correct time set on their machines. Regrettably, there is a
bug in git (which was fixed very quickly:
https://public-inbox.org/git/CAGqZTUvaiDQbiQ1dOoqLcy+GHZg+BuXY=Z+S=Dpsq=wm44dGaQ@xxxxxxxxxxxxxx/T/#t)
that causes -taggerdate not to be taken into account.

I was able to work around it by using git for-each-ref --format to
print the relevant timestamps and then using the sort command-line
utility to sort the tags. The problem with this is that the timestamps
have "only" 1-second resolution, so the order of tags created within
the same second is indeterminate. It's therefore not 100% predictable.
It is also time-based, so if someone has the wrong time set on their
machine, it might not work.
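
A rough Python equivalent of this workaround is sketched below. It assumes
the tags are annotated and were listed with a format along the lines of
'%(*committerdate:unix) %(taggerdate:unix) %(refname:short)'; that exact
format string and the function name are my assumptions, not necessarily
what the tooling uses.

```python
def sort_tags_by_dates(for_each_ref_output):
    """Sort tags newest-first by (tagged commit date, tag creation date).

    Each input line is assumed to be:
    <commit unix time> <tagger unix time> <tag name>
    """
    rows = []
    for line in for_each_ref_output.splitlines():
        if not line.strip():
            continue
        commit_ts, tag_ts, name = line.split(maxsplit=2)
        rows.append((int(commit_ts), int(tag_ts), name))
    # Primary key: date of the tagged commit; secondary key: tagger date.
    # Tags equal on both keys (same second) keep their input order.
    rows.sort(key=lambda row: (row[0], row[1]), reverse=True)
    return [name for _, _, name in rows]


sample = """\
200 300 pkg-1.1-1
100 400 pkg-1.0-1
200 300 pkg-1.1-2
200 250 pkg-1.1-0
"""
print(sort_tags_by_dates(sample))
```

In the sample, pkg-1.0-1 has the newest tagger date but sits on the
oldest commit, so it sorts last. pkg-1.1-1 and pkg-1.1-2 share both
timestamps, so nothing but input order separates them, which is the
1-second resolution problem.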

So finally, I started to think about sorting the tags by RPM version
comparison. Their names have the N-V-R format, so it kind of makes
sense to use rpm to sort them. You could tag a past commit, and that
tag would be considered the latest one only if it has the highest
N-V-R.
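
To show what sorting by N-V-R could look like, here is a simplified Python
sketch of RPM-style version comparison. It is not rpm's own
implementation: it ignores epochs and the caret (^) operator, assumes all
tags belong to the same package, and the function names are made up for
this example. Real tooling would more likely call rpm itself (for
instance labelCompare from the rpm Python bindings).

```python
import re
from functools import cmp_to_key

# A version string is split into tildes, digit runs and letter runs;
# every other character only acts as a separator.
_TOKEN = re.compile(r"~|[0-9]+|[A-Za-z]+")


def rpm_vercmp(a, b):
    """Return 1, 0 or -1, approximating rpm's version comparison."""
    tokens_a, tokens_b = _TOKEN.findall(a), _TOKEN.findall(b)
    for x, y in zip(tokens_a, tokens_b):
        if x == "~" or y == "~":
            # A tilde sorts before anything else, even the end of string.
            if x != "~":
                return 1
            if y != "~":
                return -1
            continue
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num != y_num:
            # A numeric segment is newer than an alphabetic one.
            return 1 if x_num else -1
        if x_num:
            x, y = int(x), int(y)
        if x != y:
            return 1 if x > y else -1
    if len(tokens_a) == len(tokens_b):
        return 0
    # One side has segments left: it is newer unless they start with "~".
    a_is_longer = len(tokens_a) > len(tokens_b)
    longer = tokens_a if a_is_longer else tokens_b
    rest = longer[min(len(tokens_a), len(tokens_b))]
    if rest == "~":
        return -1 if a_is_longer else 1
    return 1 if a_is_longer else -1


def compare_nvr(tag_a, tag_b):
    """Compare two name-version-release tags by version, then release."""
    _, ver_a, rel_a = tag_a.rsplit("-", 2)
    _, ver_b, rel_b = tag_b.rsplit("-", 2)
    return rpm_vercmp(ver_a, ver_b) or rpm_vercmp(rel_a, rel_b)


def latest_tag(tags):
    """Pick the tag with the highest version-release."""
    return max(tags, key=cmp_to_key(compare_nvr))


print(latest_tag(["pkg-1.9-1", "pkg-1.10-1", "pkg-1.10-2"]))
```

With this ordering, the position of a tag in history and its creation
time no longer matter: a tag placed on an old commit only wins if its
version-release is actually the highest.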

What do you think? What is the best sorting approach here?

Thanks
clime



