man/man7/pathname.7: Correct handling of pathnames

Hi Jason,

I don't think it's a good idea to recommend using the current locale for
handling pathnames.

If I use the C locale (and I do have systems with the C locale), then
programs running on that system would corrupt, or fail to handle, files
that pass through it.  Let's say you send me María.song, and I download
it on a system using the C locale.  Programs that follow the
recommendation would fail to copy the file, because its name can't be
converted in that locale.

Instead, I think a good recommendation would be to behave in one of the
following ways:

-  Accept only the POSIX Portable Filename Character Set.
-  Assume UTF-8, but reject control characters.
-  Assume UTF-8.
-  Accept anything, but reject control characters.
-  Accept anything, just like the kernel.
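
As a rough sketch of what the first and fourth policies could look like
(the function names are made up for illustration, and I'm assuming the
check is applied to a single filename, not a full path).  Rejecting a
leading hyphen is a stricter choice based on the POSIX advice for
portable filenames, and the control-character check only covers the C0
range and DEL, not the C1 controls that UTF-8 encodes in two bytes.
Neither function consults the locale:

```c
#include <stdbool.h>
#include <string.h>

/* POSIX Portable Filename Character Set, spelled out explicitly so
 * that the result doesn't depend on the locale (unlike isalnum(3)). */
#define PORTABLE_CHARS                  \
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"        \
    "abcdefghijklmnopqrstuvwxyz"        \
    "0123456789" "._-"

/* Hypothetical helper: true if 'name' is a non-empty filename made only
 * of portable characters and not starting with a hyphen. */
bool
is_portable_filename(const char *name)
{
    if (name[0] == '\0' || name[0] == '-')
        return false;

    /* strspn(3) stops at the first byte outside the set. */
    return name[strspn(name, PORTABLE_CHARS)] == '\0';
}

/* Hypothetical helper: true if 'name' contains a C0 control byte
 * (0x01..0x1F) or DEL (0x7F).  Bytes >= 0x80 are left alone. */
bool
has_control_bytes(const char *name)
{
    const unsigned char  *p;

    for (p = (const unsigned char *) name; *p != '\0'; p++) {
        if (*p < 0x20 || *p == 0x7F)
            return true;
    }
    return false;
}
```

With these, "path.c" would pass the portable check, while a name
containing a newline would be caught by has_control_bytes().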

The current locale should actively be ignored when handling pathnames.
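
For the "assume UTF-8" policies, the check can also be done without
touching the locale (so no mbrtowc(3)).  Here's a minimal sketch of a
validator; is_valid_utf8() is a name I made up, and the sketch assumes
a NUL-terminated string and rejects overlong forms, surrogates, and
code points above U+10FFFF:

```c
#include <stdbool.h>

/* Hypothetical helper: true if 's' is well-formed UTF-8. */
bool
is_valid_utf8(const char *s)
{
    const unsigned char  *p = (const unsigned char *) s;

    while (*p != '\0') {
        int            n;    /* number of continuation bytes */
        unsigned long  cp;   /* decoded code point */
        unsigned long  min;  /* smallest code point for this length */

        if (*p < 0x80) {                   /* ASCII */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {  /* 2-byte sequence */
            cp = *p & 0x1F;  n = 1;  min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {  /* 3-byte sequence */
            cp = *p & 0x0F;  n = 2;  min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {  /* 4-byte sequence */
            cp = *p & 0x07;  n = 3;  min = 0x10000;
        } else {                           /* stray continuation or invalid lead */
            return false;
        }
        p++;

        for (int i = 0; i < n; i++, p++) {
            /* A terminating NUL also fails this test, so a truncated
             * sequence is rejected without reading past the end. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF)     /* overlong or out of range */
            return false;
        if (cp >= 0xD800 && cp <= 0xDFFF)  /* UTF-16 surrogates */
            return false;
    }
    return true;
}
```

The result is the same under LC_ALL=C as under any UTF-8 locale, since
the function only looks at the bytes.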

I've modified the example in the manual page to use a filename that's
non-ASCII, to make it more interesting.  See how it fails:

	alx@devuan:~/tmp/gcc$ cat path.c 
	     #include <err.h>
	     #include <iconv.h>
	     #include <langinfo.h>
	     #include <locale.h>
	     #include <stdio.h>
	     #include <stdlib.h>
	     #include <uchar.h>

	     #define NELEMS(a)  (sizeof(a) / sizeof(a[0]))

	     int
	     main(void)
	     {
		 char      *locale_pathname;
		 char      *in, *out;
		 FILE      *fp;
		 size_t    size;
		 size_t    inbytes, outbytes;
		 iconv_t   cd;
		 char32_t  utf32_pathname[] = U"María";

		 if (setlocale(LC_ALL, "") == NULL)
		     err(EXIT_FAILURE, "setlocale");

		 size = NELEMS(utf32_pathname) * MB_CUR_MAX;
		 locale_pathname = malloc(size);
		 if (locale_pathname == NULL)
		     err(EXIT_FAILURE, "malloc");

		 cd = iconv_open(nl_langinfo(CODESET), "UTF-32");
		 if (cd == (iconv_t)-1)
		     err(EXIT_FAILURE, "iconv_open");

		 in = (char *) utf32_pathname;
		 inbytes = sizeof(utf32_pathname);
		 out = locale_pathname;
		 outbytes = size;
		 if (iconv(cd, &in, &inbytes, &out, &outbytes) == (size_t) -1)
		     err(EXIT_FAILURE, "iconv");

		 if (iconv_close(cd) == -1)
		     err(EXIT_FAILURE, "iconv_close");

		 fp = fopen(locale_pathname, "w");
		 if (fp == NULL)
		     err(EXIT_FAILURE, "fopen");

		 fputs("Hello, world!\n", fp);
		 if (fclose(fp) == EOF)
		     err(EXIT_FAILURE, "fclose");

		 free(locale_pathname);
		 exit(EXIT_SUCCESS);
	     }

	alx@devuan:~/tmp/gcc$ cc -Wall -Wextra path.c 
	alx@devuan:~/tmp/gcc$ ls
	a.out  path.c
	alx@devuan:~/tmp/gcc$ ./a.out ; echo $?
	0
	alx@devuan:~/tmp/gcc$ ls
	María  a.out  path.c
	alx@devuan:~/tmp/gcc$ cat María 
	Hello, world!
	alx@devuan:~/tmp/gcc$ LC_ALL=C ./a.out ; echo $?
	a.out: iconv: Invalid or incomplete multibyte or wide character
	1
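
For contrast, here's a sketch of the locale-independent alternative
(not a proposal for the page, just an illustration; write_greeting() is
a made-up name).  It assumes the pathname is already a sequence of
UTF-8 bytes and passes it to fopen(3) untouched, with no setlocale(3)
and no iconv(3):

```c
#include <stdio.h>

/* Hypothetical helper: create 'pathname' and write a greeting to it.
 * The pathname bytes go to the kernel as-is; the locale is never
 * consulted.  Returns 0 on success, -1 on error (errno is set). */
int
write_greeting(const char *pathname)
{
    FILE  *fp;

    fp = fopen(pathname, "w");
    if (fp == NULL)
        return -1;

    if (fputs("Hello, world!\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }

    if (fclose(fp) == EOF)
        return -1;

    return 0;
}
```

A caller could pass the UTF-8 bytes of the name explicitly, as in
write_greeting("Mar\xC3\xAD" "a"); the literal is split so that the
trailing 'a' isn't parsed as part of the hex escape.  Since nothing
here depends on the locale, running it with LC_ALL=C makes no
difference to the bytes in the pathname.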

What do you think?


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>
