Hi Jason,

On Tue, Jan 21, 2025 at 08:35:20AM -0500, Jason Yundt wrote:
> The goal of this new manual page is to help people create programs that
> do the right thing even in the face of unusual paths. The information
> that I used to create this new manual page came from these sources:
>
> • <https://unix.stackexchange.com/a/39179/316181>
> • <https://sourceware.org/pipermail/libc-help/2024-August/006737.html>
> • <https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/ext4/ext4.h?h=v6.12.9#n2288>
> • <man:unix(7)>
> • <https://unix.stackexchange.com/q/92426/316181>
>
> Signed-off-by: Jason Yundt <jason@jasonyundt.email>

Thanks! I've applied the patch, with some tweaks:
<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=5e0b1cb79b88d3a78f60bf85bfd3a76df7c10307>

Feel free to send further patches.

Have a lovely night!
Alex

> ---
> Here’s what I changed from the previous version:
>
> • I renamed inbuf to in and outbuf to out.
> • I removed the iconv_result variable.
> • I aligned and merged the variable declarations as requested.
> • I added parentheses to my use of sizeof.
> • I removed the leftover if statement.
> • I removed some unintentional spaces.
>
>  man/man7/pathname.7 | 152 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 152 insertions(+)
>  create mode 100644 man/man7/pathname.7
>
> diff --git a/man/man7/pathname.7 b/man/man7/pathname.7
> new file mode 100644
> index 000000000..96e0009e1
> --- /dev/null
> +++ b/man/man7/pathname.7
> @@ -0,0 +1,152 @@
> +.\" Copyright (C) 2025 Jason Yundt (jason@jasonyundt.email)
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH pathname 7 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +pathname,
> +filename
> +\-
> +how pathnames are encoded and interpreted
> +.SH DESCRIPTION
> +Some system calls allow you to pass a pathname as a parameter.
> +When writing code that deals with pathnames,
> +there are kernel-space requirements that you must comply with,
> +and user-space requirements that you should comply with.
> +.P
> +The kernel stores pathnames as null-terminated byte sequences.
> +The kernel has a few general rules that apply to all pathnames:
> +.IP \[bu] 3
> +The last byte in the sequence needs to be a null byte.
> +.IP \[bu]
> +Any other bytes in the sequence need to be non-null bytes.
> +.IP \[bu]
> +A 0x2F byte is always interpreted as a directory separator (/)
> +and cannot be part of a filename.
> +.IP \[bu]
> +A pathname can be at most PATH_MAX bytes long.
> +PATH_MAX is defined in
> +.BR limits.h (0p)\
> +\.
> +A pathname that’s longer than PATH_MAX bytes
> +can be split into multiple smaller pathnames and opened piecewise using
> +.BR openat (2).
> +.IP \[bu]
> +A filename can be at most a certain number of bytes long.
> +The number is filesystem-specific.
> +You can get the filename length limit for a currently mounted filesystem
> +by passing _PC_NAME_MAX to
> +.BR fpathconf (3)\
> +\.
> +For maximum portability, programs should be able to handle filenames
> +that are as long as the relevant filesystems will allow.
> +For maximum portability, programs and users should limit the length
> +of their own pathnames to NAME_MAX bytes.
> +NAME_MAX is defined in
> +.BR limits.h (0p)\
> +\.
> +.P
> +The kernel also has some rules that only apply in certain situations.
> +Here are some examples:
> +.IP \[bu] 3
> +Filenames on the ext4 filesystem can be at most 255 bytes long.
> +.IP \[bu]
> +Filenames on the vfat filesystem cannot contain a
> +0x22, 0x2A, 0x3A, 0x3C, 0x3E, 0x3F, 0x5C or 0x7C byte
> +(", *, :, <, >, ?, \ or | in ASCII)
> +unless the filesystem was mounted with iocharset set to something unusual.
> +.IP \[bu]
> +A UNIX domain socket’s sun_path can be at most 108 bytes long (see
> +.BR unix (7)
> +for details).
> +.P
> +User space treats pathnames differently.
> +User space applications typically expect pathnames to use
> +a consistent character encoding.
> +For maximum interoperability, programs should use
> +.BR nl_langinfo (3)
> +to determine the current locale’s codeset.
> +Paths should be encoded and decoded using the current locale’s codeset
> +in order to help prevent mojibake.
> +For maximum interoperability,
> +programs and users should also limit
> +the characters that they use for their own pathnames to characters in
> +.UR https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap03.html#tag_03_265
> +the POSIX Portable Filename Character Set
> +.UE .
> +.SH EXAMPLES
> +The following program demonstrates
> +how to ensure that a pathname uses the proper encoding.
> +The program starts with a UTF-32 encoded pathname.
> +It then calls
> +.BR nl_langinfo (3)
> +in order to determine what the current locale’s codeset is.
> +After that, it uses
> +.BR iconv (3)
> +to convert the UTF-32 encoded pathname into a locale codeset encoded pathname.
> +Finally, the program uses the locale codeset encoded pathname to create
> +a file that contains the message “Hello, world!”
> +.SS Program source
> +.\" SRC BEGIN (pathname_encoding_example.c)
> +.EX
> +#include <err.h>
> +#include <iconv.h>
> +#include <langinfo.h>
> +#include <locale.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <uchar.h>
> +\&
> +#define NELEMS(a) (sizeof(a) / sizeof(a[0]))
> +\&
> +int
> +main(void)
> +{
> +    char            *locale_pathname;
> +    char            *in, *out;
> +    FILE            *fp;
> +    size_t          size;
> +    size_t          inbytes, outbytes;
> +    iconv_t         cd;
> +    const char32_t  utf32_pathname[] = U"example";
> +\&
> +    if (setlocale(LC_ALL, "") == NULL)
> +        err(EXIT_FAILURE, "setlocale");
> +\&
> +    size = NELEMS(utf32_pathname) * MB_CUR_MAX;
> +    locale_pathname = malloc(size);
> +    if (locale_pathname == NULL)
> +        err(EXIT_FAILURE, "malloc");
> +\&
> +    cd = iconv_open(nl_langinfo(CODESET), "UTF\-32");
> +    if (cd == (iconv_t)\-1)
> +        err(EXIT_FAILURE, "iconv_open");
> +\&
> +    in = (char *) utf32_pathname;
> +    inbytes = sizeof(utf32_pathname);
> +    out = locale_pathname;
> +    outbytes = size;
> +    if (iconv(cd, &in, &inbytes, &out, &outbytes) == \-1)
> +        err(EXIT_FAILURE, "iconv");
> +\&
> +    if (iconv_close(cd) == \-1)
> +        err(EXIT_FAILURE, "iconv_close");
> +\&
> +    fp = fopen(locale_pathname, "w");
> +    fputs("Hello, world!\[rs]n", fp);
> +    if (fclose(fp) == EOF)
> +        err(EXIT_FAILURE, "fclose");
> +\&
> +    free(locale_pathname);
> +    exit(EXIT_SUCCESS);
> +}
> +.EE
> +.\" SRC END
> +.SH SEE ALSO
> +.BR limits.h (0p),
> +.BR open (2),
> +.BR fpathconf (3),
> +.BR iconv (3),
> +.BR nl_langinfo (3),
> +.BR path_resolution (7),
> +.BR mount (8)
> --
> 2.47.1
>

--
<https://www.alejandro-colomar.es/>