Re: [PATCH] man/man7/path-format.7: Add file documenting format of pathnames

Hi Jason,

On Mon, Jan 13, 2025 at 04:32:46PM -0500, Jason Yundt wrote:
> The goal of this new manual page is to help people create programs that
> do the right thing even in the face of unusual paths.  The information
> that I used to create this new manual page came from this Unix & Linux
> Stack Exchange answer [1] and from this Libc-help mailing list post [2].
> 
> [1]: <https://unix.stackexchange.com/a/39179/316181>
> [2]: <https://sourceware.org/pipermail/libc-help/2024-August/006737.html>
> 
> Signed-off-by: Jason Yundt <jason@jasonyundt.email>
> ---
>  man/man7/path-format.7 | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>  create mode 100644 man/man7/path-format.7
> 
> diff --git a/man/man7/path-format.7 b/man/man7/path-format.7
> new file mode 100644
> index 000000000..c3c01cbf5
> --- /dev/null
> +++ b/man/man7/path-format.7
> @@ -0,0 +1,41 @@
> +.\" Copyright (C) 2025 Jason Yundt (jason@jasonyundt.email)
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH PATH-FORMAT 7 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +path-format \- how pathnames are encoded and interpreted

I would use path_format instead of path-format or PATH-FORMAT.

> +.SH DESCRIPTION
> +Some system calls allow you to pass a pathname as a parameter.
> +When writing code that deals with paths,
> +there are kernel space requirements that you must comply with
> +and userspace requirements that you should comply with.
> +.P
> +The kernel stores paths as null-terminated byte sequences.
> +As far as the kernel is concerned, there are only three rules for paths:
> +.IP \[bu]
> +The last byte in the sequence needs to be a null.
> +.IP \[bu]
> +Any other bytes in the sequence need to not be null bytes.

I would write this as

	... need to be non-null bytes.

which seems easier to read.

> +.IP \[bu]
> +A 0x2F byte is always interpreted as a directory separator (/).
> +.P
> +This means that programs can technically do weird things
> +like create paths using random character encodings
> +or create paths without using any character encoding at all.
> +Filesystems may impose additional restrictions on paths, though.
> +For example, if you want to store a file on an ext4 filesystem,
> +then its filename can’t be longer than 255 bytes.
> +.P
> +Userspace treats paths differently.
> +Userspace applications typically expect paths to use
> +a consistent character encoding.
> +For maximum interoperability, programs should use
> +.BR nl_langinfo (3)
> +to determine the current locale’s codeset.

I would say that for maximum interoperability one should self-limit to
the POSIX Portable Filename Character Set:
<https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap03.html#tag_03_265>
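
A minimal, untested sketch of what such a self-imposed limit could look
like in C.  The helper name is_portable_filename() is hypothetical (it
is not part of any libc), and the sketch only checks a single filename
against the character set (A-Z, a-z, 0-9, period, underscore, and
hyphen-minus); it does not look at length limits or any other rules:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of any libc.
 * Returns true if name is non-empty and consists only of bytes from
 * the POSIX Portable Filename Character Set.
 */
bool
is_portable_filename(const char *name)
{
	static const char portable[] =
	    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	    "abcdefghijklmnopqrstuvwxyz"
	    "0123456789._-";

	/* An empty string is not a usable filename. */
	if (name[0] == '\0')
		return false;

	/*
	 * strspn(3) returns the length of the leading run of bytes
	 * that all appear in portable.  If that run reaches the
	 * terminating null byte, every byte was portable.
	 */
	return name[strspn(name, portable)] == '\0';
}
```

With this, "path_format.7" would be accepted, while names containing a
space, a slash, or any non-ASCII byte would be rejected.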


Have a lovely night!
Alex

> +Paths should be encoded and decoded using the current locale’s codeset
> +in order to help prevent mojibake.
> +.SH SEE ALSO
> +.BR open (2),
> +.BR nl_langinfo (3),
> +.BR path_resolution (7)
> -- 
> 2.47.0
> 

-- 
<https://www.alejandro-colomar.es/>
