Re: [PATCH v3] strverscmp.3: this is NOT the ordering used by ls -v

Hi nab,

On Mon, Dec 16, 2024 at 02:00:45AM +0100, наб wrote:
> On Sun, Dec 15, 2024 at 10:44:26PM +0100, Alejandro Colomar wrote:
> > On Sun, Dec 15, 2024 at 10:02:42PM +0100, наб wrote:
> > > > Should we file a bug against glibc strverscmp(3)?  We probably should.
> > > > 
> > > > And the reference to sort(1), I'd put it in BUGS, saying that this API
> > > > is broken, and does not sort properly.  Sounds good?
> > > No, this API works as-documented, and the implementation is useful.
> > What does useful mean?
> There are applications where a lexicographical-except-numeric comparison
> like this is what you want (it's most of them). Calling it a "version
> sort" is silly + goofy but, whatever.

Hmmm, yeah, we can live with that for historical raisins.

> > > It's just not what ls -v does.
> > While version sort isn't something standard, I think GNU should be
> > self-consistent.
> It is, ls -v and sort -V are consistent.
> Having just implemented the /actual/ algorithm they use for voreutils,
> that is by far /not/ universally applicable, much hairier, and hard-tuned for
> "versions that are kinda like debian describes and sorts them (but not actually)
>  AND ALSO we put them in filenames where we can assume the format a little bit
>  AND ALSO {4 special cases to make ls -v work}".
> Replacing this well-defined lexicographical-except-numeric sorter with... that,
> isn't really applicable.

Sounds reasonable.

> 
> Best,
> -- >8 --
> From: =?UTF-8?q?Ahelenia=20Ziemia=C5=84ska?=
>  <nabijaczleweli@xxxxxxxxxxxxxxxxxx>
> Subject: [PATCH v3] strverscmp.3: this is NOT the ordering used by ls -v
> 
> Compare, given:
> 	#define _GNU_SOURCE  /* for strverscmp() */
> 	#include <stdlib.h>
> 	#include <stdio.h>
> 	#include <string.h>
> 	/* qsort() hands the comparator pointers to the array elements */
> 	int compar(const void *l, const void *r) {
> 		return strverscmp(*(char *const *)l, *(char *const *)r);
> 	}
> 	int main(int argc, char ** argv) {
> 		qsort(argv + 1, argc - 1, sizeof(*argv), compar);
> 		for(int i = 1; i <  argc; ++i)
> 			puts(argv[i]);
> 	}
> yields:
> 	$ /bin/ls -v1 a*  # coreutils ls
> 	a-1.0a
> 	a-1.0.1a
> 	$ ../vers a*      # as above
> 	a-1.0.1a
> 	a-1.0a
> 	$ ls -v1 a*       # voreutils ls @ 5781698 with strverscmp()-equivalent sorting
> 	a-1.0.1a
> 	a-1.0a
> compare also the results for real data like
> 	netstat-nat-1.{0,1{,.1},2,3.1,4{,.{1,2,3,4,5,6,7,8,9,10}}}.tar.gz
> 
> Thus, coreutils ls -v does NOT use strverscmp(3);
> it uses a modified Debian version comparison algorithm with additional
> suffix processing and ls -v-specific exceptions.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@xxxxxxxxxxxxxxxxxx>
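
For anyone following along, the pairs discussed in this thread can be
checked directly against strverscmp(3) and strcmp(3).  Below is a minimal
sketch, assuming a libc that provides strverscmp() (glibc does).  The
helper names sign() and check_thread_examples() are made up for
illustration and are not part of the patch.

```c
#define _GNU_SOURCE /* strverscmp() is a GNU extension in <string.h> */
#include <assert.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or 1 so results are comparable. */
int sign(int x)
{
	return (x > 0) - (x < 0);
}

/* Hypothetical helper: asserts the orderings described in this thread. */
void check_thread_examples(void)
{
	/* Numeric runs compare by value: jan9 sorts before jan10 ... */
	assert(sign(strverscmp("jan9", "jan10")) == -1);
	/* ... while plain lexicographic order puts jan10 first. */
	assert(sign(strcmp("jan9", "jan10")) == 1);

	/*
	 * The pair from the commit message: strverscmp() puts a-1.0.1a
	 * first, which is the opposite of what coreutils ls -v printed.
	 */
	assert(sign(strverscmp("a-1.0.1a", "a-1.0a")) == -1);
}
```

Calling check_thread_examples() from a main() should return silently on
glibc.  It says nothing about what sort -V or ls -v do, since those use a
different algorithm, as described in the commit message.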

Patch applied.  Thanks!

Have a lovely day!
Alex

> ---
>  man/man3/strverscmp.3 | 23 ++++++++---------------
>  1 file changed, 8 insertions(+), 15 deletions(-)
> 
> diff --git a/man/man3/strverscmp.3 b/man/man3/strverscmp.3
> index 41bc1ddbd..e028d6788 100644
> --- a/man/man3/strverscmp.3
> +++ b/man/man3/strverscmp.3
> @@ -18,25 +18,14 @@ .SH SYNOPSIS
>  .BI "int strverscmp(const char *" s1 ", const char *" s2 );
>  .fi
>  .SH DESCRIPTION
> -Often one has files
> +For a dataset like
>  .IR jan1 ", " jan2 ", ..., " jan9 ", " jan10 ", ..."
> -and it feels wrong when
> -.BR ls (1)
> -orders them
> +sorting it lexicographically yields
>  .IR jan1 ", " jan10 ", ..., " jan2 ", ..., " jan9 .
>  .\" classical solution: "rename jan jan0 jan?"
> -In order to rectify this, GNU introduced the
> -.I \-v
> -option to
> -.BR ls (1),
> -which is implemented using
> -.BR versionsort (3),
> -which again uses
> -.BR strverscmp ().
> -.P
> -Thus, the task of
> +The task of
>  .BR strverscmp ()
> -is to compare two strings and find the "right" order, while
> +is to compare two strings yielding the former order, while
>  .BR strcmp (3)
>  finds only the lexicographic order.
>  This function does not use
> @@ -44,6 +33,10 @@ .SH DESCRIPTION
>  .BR LC_COLLATE ,
>  so is meant mostly for situations
>  where the strings are expected to be in ASCII.
> +This is different from the ordering produced by
> +.BR sort (1)
> +.BR -V .
> +.\" sort -V sorts a-1.0a < a-1.0.1a; strverscmp() does not
>  .P
>  What this function does is the following.
>  If both strings are equal, return 0.
> -- 
> 2.39.5
> 



-- 
<https://www.alejandro-colomar.es/>
