On Sun, Dec 15, 2024 at 09:43:58PM +0100, Alejandro Colomar wrote:
> On Sun, Dec 15, 2024 at 09:17:59PM +0100, Ahelenia Ziemiańska wrote:
> > Compare, given:
> >   #include <stdlib.h>
> >   #include <stdio.h>
> >   #include <string.h>
> >   int compar(const char **l, const char **r) {
> >     return strverscmp(*l, *r);
> >   }
> >   int main(int argc, char ** argv) {
> >     qsort(argv + 1, argc - 1, sizeof(*argv), compar);
> >     for(int i = 1; i < argc; ++i)
> >       puts(argv[i]);
> >   }
> > yields:
> >   $ /bin/ls -v1 a*  # coreutils ls
> >   a-1.0a
> >   a-1.0.1a
> >   $ ../vers a*      # as above
> >   a-1.0.1a
> >   a-1.0a
> >   $ ls -v1 a*       # voreutils ls @ 5781698 with strverscmp()-equivalent sorting
> >   a-1.0.1a
> >   a-1.0a
> Should we file a bug against glibc strverscmp(3)?  We probably should.
>
> And the reference to sort(1), I'd put it in BUGS, saying that this API
> is broken, and does not sort properly.  Sounds good?
No, this API works as-documented, and the implementation is useful.
It's just not what ls -v does.

> > @@ -44,6 +35,10 @@ .SH DESCRIPTION
> >  .BR LC_COLLATE ,
> >  so is meant mostly for situations
> >  where the strings are expected to be in ASCII.
> > +This is not actually the ordering produced by
> > +.BR ls (1)
> > +.BR -v .
> > +.\" because it considers a-1.0.1a < a-1.0a; this is not what you want
> Please refer to sort(1) instead.  I would wipe any references to file
> names in this page, as I don't think they are relevant at all.
Applied in scissor-patch, below

Best,
-- >8 --
From: =?UTF-8?q?Ahelenia=20Ziemia=C5=84ska?= <nabijaczleweli@xxxxxxxxxxxxxxxxxx>
Subject: [PATCH] strverscmp.3: this is NOT the ordering used by ls -v

Compare, given:
  #include <stdlib.h>
  #include <stdio.h>
  #include <string.h>
  int compar(const char **l, const char **r) {
    return strverscmp(*l, *r);
  }
  int main(int argc, char ** argv) {
    qsort(argv + 1, argc - 1, sizeof(*argv), compar);
    for(int i = 1; i < argc; ++i)
      puts(argv[i]);
  }
yields:
  $ /bin/ls -v1 a*  # coreutils ls
  a-1.0a
  a-1.0.1a
  $ ../vers a*      # as above
  a-1.0.1a
  a-1.0a
  $ ls -v1 a*       # voreutils ls @ 5781698 with strverscmp()-equivalent sorting
  a-1.0.1a
  a-1.0a

compare also the results for real data like
  netstat-nat-1.{0,1{,.1},2,3.1,4{,.{1,2,3,4,5,6,7,8,9,10}}}.tar.gz

Thus, coreutils ls -v does NOT use strverscmp(3),
it uses a similar algorithm that actually properly sorts versions,
not just single numbers.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@xxxxxxxxxxxxxxxxxx>
---
 man/man3/strverscmp.3 | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/man/man3/strverscmp.3 b/man/man3/strverscmp.3
index 41bc1ddbd..65346410c 100644
--- a/man/man3/strverscmp.3
+++ b/man/man3/strverscmp.3
@@ -18,25 +18,14 @@ .SH SYNOPSIS
 .BI "int strverscmp(const char *" s1 ", const char *" s2 );
 .fi
 .SH DESCRIPTION
-Often one has files
+For a dataset like
 .IR jan1 ", " jan2 ", ..., " jan9 ", " jan10 ", ..."
-and it feels wrong when
-.BR ls (1)
-orders them
+sorting it lexicographically yields
 .IR jan1 ", " jan10 ", ..., " jan2 ", ..., " jan9 .
 .\" classical solution: "rename jan jan0 jan?"
-In order to rectify this, GNU introduced the
-.I \-v
-option to
-.BR ls (1),
-which is implemented using
-.BR versionsort (3),
-which again uses
-.BR strverscmp ().
-.P
-Thus, the task of
+The task of
 .BR strverscmp ()
-is to compare two strings and find the "right" order, while
+is to compare two strings yielding the former order, while
 .BR strcmp (3)
 finds only the lexicographic order.
 This function does not use
@@ -44,6 +33,10 @@ .SH DESCRIPTION
 .BR LC_COLLATE ,
 so is meant mostly for situations
 where the strings are expected to be in ASCII.
+This is different from the ordering produced by
+.BR sort (1)
+.BR -V .
+.\" because it considers a-1.0.1a < a-1.0a; this is not what you want
 .P
 What this function does is the following.
 If both strings are equal, return 0.
-- 
2.39.5
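
A note for anyone rebuilding the test program from the commit message:
the following is a minimal sketch, assuming glibc (or another libc that
provides the GNU extension strverscmp()).  It wraps the same
strverscmp() call in a comparator with the const void * signature that
qsort(3) expects, and defines _GNU_SOURCE so the prototype is visible.
The names compare_versions() and sort_versions() are illustrative only
and are not part of the patch.

```c
#define _GNU_SOURCE	/* strverscmp() is a GNU extension */
#include <stdlib.h>
#include <string.h>

/*
 * Repeated here so the sketch still compiles if <string.h> was already
 * included before _GNU_SOURCE took effect; it matches the prototype in
 * the man page SYNOPSIS.
 */
int strverscmp(const char *s1, const char *s2);

/*
 * qsort() passes pointers to the array elements, so for an array of
 * char * each argument is really a pointer to a char *.
 */
static int compare_versions(const void *l, const void *r)
{
	const char *const *ls = l;
	const char *const *rs = r;

	return strverscmp(*ls, *rs);
}

/* Hypothetical helper: sort an array of strings in strverscmp() order. */
void sort_versions(char **names, size_t count)
{
	qsort(names, count, sizeof(*names), compare_versions);
}
```

Calling sort_versions(argv + 1, argc - 1) and printing the result should
reproduce the ../vers behaviour shown above, including a-1.0.1a sorting
before a-1.0a.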