A modest proposal regarding pathnames (was: [PATCH v4] man/man7/pathname.7: Add file documenting format of pathnames)

Hi Alex,

At 2025-01-15T18:20:45+0100, Alejandro Colomar wrote:
> Maybe, since this also discusses filenames, we should use both names:
> 
> 	.SH NAME
> 	filename,
> 	pathname
> 	\-
> 	...
...
> > +.IP \[bu]
> > +A path can be at most 4,096 bytes long.
> 
> For self-consistency, let's use the same term all of the time: either
> path or pathname.  Otherwise, a reader might think they are different
> things.
> 
> For consistency with POSIX, let's say pathname, since that's what POSIX
> uses:
> <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap03.html#tag_03_254>

One way we've stepped on a rake in Unix terminology, and for no good
reason I've been able to discover, is that we cling to the practice of
referring to two different things as "paths".[1]

* file names, possibly qualified with location information that may be
  absolute or relative to the current working directory ("pathname",
  "absolute path(name)", "relative path(name)")

* a list of the foregoing used to search for command file names or other
  loadable resources that an application thinks likely to exist ("PATH",
  "LD_LIBRARY_PATH", "MANPATH", "PYTHONPATH", "CLASSPATH", etc.)

To state it differently, we are passionately committed to using the term
"path" to refer to objects of significantly distinguishable types, such
as:

  char *
and
  char **.

And since this application doesn't admit general recursion--we don't
ever refer to a single character as a "path", nor to a list of lists of
file names as a "path"--the usage is corrosive to coherent thought.
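
To make the type distinction concrete, here is a minimal sketch in C.
The function name split_search_path and its calling convention are
hypothetical, invented for this illustration.  It assumes a
colon-separated value in the style of "PATH" and turns that one string
into a list of directory names, which is a different kind of object
from any single entry in it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split a colon-separated search path in place.
 * buf is overwritten (each ':' becomes '\0'); dirs receives pointers
 * into buf and is always NULL-terminated.  At most max - 1 entries are
 * stored; any further entries are dropped.  Returns the entry count.
 */
size_t split_search_path(char *buf, char **dirs, size_t max)
{
    size_t n = 0;
    char *p = buf;

    if (max == 0)
        return 0;

    while (n + 1 < max) {
        dirs[n++] = p;          /* each entry is a plain "char *" */
        p = strchr(p, ':');
        if (p == NULL)
            break;
        *p++ = '\0';
    }
    dirs[n] = NULL;             /* the list as a whole is a "char **" */
    return n;
}
```

Given a writable copy of "/usr/local/bin:/usr/bin:/bin", the helper
stores three entries.  Each entry can be handed to an interface that
expects a file name; the list cannot.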

I don't have any real hope of reforming this abhorrent practice--
I fear the cement had set good and hard before even POSIX Issue _1_
came out.  (Can I blame "/usr/group"?)

But...in the event the donkey I'm riding has borrowed some of its
genetic material from a vigorous young warhorse (let's call him
"JeanHeyd"), I would:

1.  Reserve the term "path" solely for discussion of search paths, such
    as those implemented by "PATH".

2.  Adopt the term "filespec", or "file specification", or even just
    "file name", to refer to a character sequence that locates a file.
    POSIX interfaces and utilities tend strongly to be general in this
    respect, in the sense that anywhere a "basename" (the "final
    component" of a "pathname") is accepted, one that is qualified is
    also accepted, as in an "absolute pathname" or "relative pathname".

    The occasions upon which you want to refuse to traverse outside of a
    directory are rare enough, and specialized enough, that they merit
    case-specific discussion.  These are replete with complication.  Is
    traversing only into a subdirectory of the current working directory
    acceptable?  Should symlinks be followed?  If so, should they be
    permitted to escape the part of the tree descended from the current
    working directory?  Back in the day, about a thousand security
    advisories were issued against FTP servers arising from confused or
    unstated policy here, and the terminology of "pathname" did
    _nothing_ to help resolve them.  (Did that term help create the
    problems by fogging the minds of the application developers?  Who
    knows?)

and

3.  Throw away the term "pathname" entirely.  Banish it.
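
As an illustration of why the traversal questions under item 2 need
case-specific policy, here is a rough sketch of one possible answer to
the first of them.  The function name lexically_escapes is
hypothetical.  It looks only at the characters of the file name, so it
says nothing about symlinks: a name that passes this check can still
leave the tree if one of its components is a symlink, and "a/.." need
not lead back to the starting directory in that case.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check: does this file name, read purely as text, step
 * outside the directory it is resolved against?  Absolute names count
 * as escaping.  Symlinks are NOT considered; this is lexical only.
 */
bool lexically_escapes(const char *name)
{
    long depth = 0;             /* components below the starting point */
    const char *p = name;

    if (*p == '/')
        return true;            /* absolute: ignores the starting point */

    while (*p != '\0') {
        size_t len = strcspn(p, "/");   /* length of this component */

        if (len == 2 && p[0] == '.' && p[1] == '.') {
            if (--depth < 0)
                return true;    /* climbed above the starting point */
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            depth++;            /* ordinary component: one level down */
        }
        p += len;
        while (*p == '/')       /* skip one or more separators */
            p++;
    }
    return false;
}
```

Under this policy "a/b/../c" is accepted while "a/../../b" and
"../etc/passwd" are refused.  Whether that policy is the right one is
exactly the sort of thing an application has to state for itself.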

And yes, I know, POSIXly correct people can claim to "eliminate" this
confusion by interrupting conversations with a raised finger:

"No, no--you don't mean 'path', but 'path_name_'."

In my life I have found that I have sufficient talent for being
simultaneously right and annoying.  I don't need that kind of help.

So--will you ride with me, Sancho?  I mean, Alex?  ;-)

> > +.IP \[bu]
> > +If you want to store a file on a vfat filesystem,
> > +then its filename can’t contain a 0x3A byte (: in ASCII)
> 
> Is that the only one?  I expect there are several characters that are
> not allowed in vfat.

You also can't _end_ a file name with "." (0x2E).  I think there are
other restrictions.  Putting my own music collection on a file system
that I needed to be able to share with Windows boxes, many years ago,
was a tedious exercise in discovering VFAT's irritating limitations.
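
The two restrictions named in this thread can be written down as a
small check.  This is a sketch only: the function name
vfat_name_plausible is hypothetical, the rejection of an empty name is
my own assumption, and the check covers nothing beyond the colon and
the trailing dot, so passing it does not mean VFAT will accept the
name.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical, deliberately incomplete check of a single file name
 * (not a whole directory-qualified name) against the two VFAT
 * restrictions mentioned above.  Other restrictions exist and are
 * not covered here.
 */
bool vfat_name_plausible(const char *name)
{
    size_t len = strlen(name);

    if (len == 0)
        return false;           /* assumption: empty names are rejected */
    if (strchr(name, ':') != NULL)
        return false;           /* contains a 0x3A byte */
    if (name[len - 1] == '.')
        return false;           /* ends with a 0x2E byte */
    return true;
}
```

A name such as "track01.flac" passes, while "AC:DC.flac" and "etc."
are refused.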

Regards,
Branden

[1] George Lakoff would probably have something to say about the
    unreasonable persistence of metaphors.  When a technical person
    finds that they can employ a notion familiar as a childhood fairy
    tale--as with Hansel and Gretel ambling through the forest--to win
    claims of comprehension from the audience for their design, they
    cling to it passionately.  In Unix, both kernel- and user-space
    developers did so, and neither yielded, snarling like a pair of
    dogs, one at each end of a femur still slick with gore from a bovine
    carcass.  I admit that I'm impressed that Thompson[2] was fought to
    a draw in this instance.  Unfortunately that outcome was the least
    helpful one for the Unix community.  Either side winning would have
    been better.

[2] Or whoever involved with the Unix kernel refused to yield here.

