Re: man/man7/pathname.7: Correct handling of pathnames

On Mon, Jan 27, 2025 at 04:53:10PM +0100, Alejandro Colomar wrote:
> Right.  But then, when do you need to do encoding?

Personally, my preference is that programs use the locale’s codeset
because I can override the locale codeset in the rare event that UTF-8
isn’t the correct option.  In my previous example, I was able to set the
LANG environment variable to ja_JP.SJIS so that I could run that old
software in an environment where pathnames were encoded in Shift-JIS.
If everything just always assumed a particular character encoding for
pathnames, then I wouldn’t have been able to do that.
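
As a rough illustration of what I mean, here is a minimal Python sketch
for Unix-like systems.  The helper name decode_pathname is hypothetical,
and the optional codeset parameter is only there so that the behaviour
can be shown without changing the environment.  It is one possible way
to follow the locale, not a recommendation for the man page.

```python
import locale
from typing import Optional


def decode_pathname(raw: bytes, codeset: Optional[str] = None) -> str:
    """Decode pathname bytes using the locale's codeset (hypothetical helper)."""
    if codeset is None:
        # Adopt the user's locale (LANG, LC_ALL, LC_CTYPE), then ask
        # which character encoding that locale uses.
        locale.setlocale(locale.LC_CTYPE, "")
        codeset = locale.nl_langinfo(locale.CODESET)
    # Raises UnicodeDecodeError if the bytes are not valid in that codeset.
    return raw.decode(codeset)


if __name__ == "__main__":
    # The same bytes only make sense under the right codeset.
    sjis_name = "日本.txt".encode("shift_jis")
    print(decode_pathname(sjis_name, "SJIS"))
```

With no explicit codeset, the result depends entirely on the
environment, which is the point: overriding LANG changes how the same
bytes are interpreted.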

That being said, I still don’t really know if that’s the best option.

> Programs will either receive the pathname from the command line, or
> read it from some file, or create one of its own.
> 
> When creating a path of its own, it should restrict itself to the
> Portable Filename Character Set, so encoding shouldn't be a problem.
> 
> When reading pathnames, they'll already be encoded suitably.
> 
> > > Instead, I think a good recommendation would be to behave in one of the
> > > following ways:
> > > 
> > > -  Accept only the POSIX Portable Filename Character Set.
> > 
> > This one isn’t quite a complete recommendation.  The POSIX Portable
> > Filename Character Set is just a character set.  It’s not a character
> > encoding.  If we go with this one, then we would need to say something
> > along the lines of “Encode and decode paths using ASCII and only accept
> > characters that are in the POSIX Portable Filename Character Set.”
> > 
> > > -  Assume UTF-8, but reject control characters.
> > > -  Assume UTF-8.
> > 
> > > -  Accept anything, but reject control characters.
> > > -  Accept anything, just like the kernel.
> > 
> > These last two also aren’t quite complete recommendations.  If a GUI
> > program wants to display a pathname on the screen, then what character
> > encoding should it use when decoding the bytes?
> 
> Just print them as they got in.  No decoding.  Send the raw bytes to
> write(2) or printf(3) or whatever.

I don’t think that printing is a good way for GUI applications to
display text.  I don’t normally run GUI applications in a terminal, so
I’m not normally able to see a GUI application’s stdout or stderr.  Most
of the GUI applications that I use display pathnames as part of a larger
window.  In order to do that, the GUI application needs to know which
characters the bytes in the pathname represent so that the GUI
application can draw those characters on the screen.
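
To make that concrete, here is a small Python sketch of the decoding
step a GUI program would need before drawing a pathname.  The function
name display_label and the UTF-8 default are assumptions of mine for
illustration; a real program might take the codeset from the locale
instead.  The raw bytes would still be what gets passed to open(2) and
friends; only the on-screen label is decoded.

```python
def display_label(raw: bytes, codeset: str = "utf-8") -> str:
    """Return text that can be drawn for the pathname bytes `raw`.

    Hypothetical helper: the result is for display only and cannot
    always be converted back into the original bytes.
    """
    # Bytes that are invalid in the chosen codeset become U+FFFD
    # (REPLACEMENT CHARACTER) instead of raising an exception.
    text = raw.decode(codeset, errors="replace")
    # Control characters and other non-printable characters are also
    # shown as U+FFFD so that they cannot disturb the layout.
    return "".join(ch if ch.isprintable() else "\ufffd" for ch in text)


if __name__ == "__main__":
    print(display_label("naïve.txt".encode("utf-8")))
    print(display_label(b"bad\xffname"))
```

Whatever codeset is chosen, the program has to choose one; there is no
way to draw glyphs from bytes without that decision.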



