On Sun, Dec 15, 2024 at 10:44:26PM +0100, Alejandro Colomar wrote:
> On Sun, Dec 15, 2024 at 10:02:42PM +0100, наб wrote:
> > > Should we file a bug against glibc strverscmp(3)?  We probably should.
> > >
> > > And the reference to sort(1), I'd put it in BUGS, saying that this API
> > > is broken, and does not sort properly.  Sounds good?
> > No, this API works as-documented, and the implementation is useful.
> What does useful mean?
There are applications where a lexicographical-except-numeric comparison
like this is what you want (it's most of them).
Calling it a "version sort" is silly + goofy but, whatever.

> > It's just not what ls -v does.
> While version sort isn't something standard, I think GNU should be
> self-consistent.
It is, ls -v and sort -V are consistent.

Having just implemented the /actual/ algorithm they use for voreutils,
that is by far /not/ universally applicable, much hairier, and
hard-tuned for "versions that are kinda like Debian describes and sorts
them (but not actually) AND ALSO we put them in filenames where we can
assume the format a little bit AND ALSO {4 special cases to make ls -v
work}".

Replacing this well-defined lexicographical-except-numeric sorter
with... that, isn't really applicable.

Best,
-- >8 --
From: =?UTF-8?q?Ahelenia=20Ziemia=C5=84ska?= <nabijaczleweli@xxxxxxxxxxxxxxxxxx>
Subject: [PATCH v3] strverscmp.3: this is NOT the ordering used by ls -v

Compare, given:
	#define _GNU_SOURCE  /* for strverscmp() */
	#include <stdlib.h>
	#include <stdio.h>
	#include <string.h>

	/* qsort() comparator over an array of char *, ordering by strverscmp() */
	int compar(const void *l, const void *r) {
		return strverscmp(*(char *const *)l, *(char *const *)r);
	}

	int main(int argc, char ** argv) {
		qsort(argv + 1, argc - 1, sizeof(*argv), compar);
		for(int i = 1; i < argc; ++i)
			puts(argv[i]);
	}
yields:
	$ /bin/ls -v1 a*  # coreutils ls
	a-1.0a
	a-1.0.1a
	$ ../vers a*      # as above
	a-1.0.1a
	a-1.0a
	$ ls -v1 a*       # voreutils ls @ 5781698 with strverscmp()-equivalent sorting
	a-1.0.1a
	a-1.0a

compare also the results for real data like
	netstat-nat-1.{0,1{,.1},2,3.1,4{,.{1,2,3,4,5,6,7,8,9,10}}}.tar.gz

Thus, coreutils ls -v does NOT use strverscmp(3); it uses a modified
Debian version comparison algorithm with additional suffix processing
and ls -v-specific exceptions.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@xxxxxxxxxxxxxxxxxx>
---
 man/man3/strverscmp.3 | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/man/man3/strverscmp.3 b/man/man3/strverscmp.3
index 41bc1ddbd..e028d6788 100644
--- a/man/man3/strverscmp.3
+++ b/man/man3/strverscmp.3
@@ -18,25 +18,14 @@ .SH SYNOPSIS
 .BI "int strverscmp(const char *" s1 ", const char *" s2 );
 .fi
 .SH DESCRIPTION
-Often one has files
+For a dataset like
 .IR jan1 ", " jan2 ", ..., " jan9 ", " jan10 ", ..."
-and it feels wrong when
-.BR ls (1)
-orders them
+sorting it lexicographically yields
 .IR jan1 ", " jan10 ", ..., " jan2 ", ..., " jan9 .
 .\" classical solution: "rename jan jan0 jan?"
-In order to rectify this, GNU introduced the
-.I \-v
-option to
-.BR ls (1),
-which is implemented using
-.BR versionsort (3),
-which again uses
-.BR strverscmp ().
-.P
-Thus, the task of
+The task of
 .BR strverscmp ()
-is to compare two strings and find the "right" order, while
+is to compare two strings yielding the former order, while
 .BR strcmp (3)
 finds only the lexicographic order.
 This function does not use
@@ -44,6 +33,10 @@ .SH DESCRIPTION
 .BR LC_COLLATE ,
 so is meant mostly for situations where
 the strings are expected to be in ASCII.
+This is different from the ordering produced by
+.BR sort (1)
+.BR -V .
+.\" sort -V sorts a-1.0a < a-1.0.1a; strverscmp() does not
 .P
 What this function does is the following.
 If both strings are equal, return 0.
-- 
2.39.5