Re: Allowed characters in strings passed to/from device-mapper APIs

Hi

On Sun, 9 Mar 2025, Zack Weinberg wrote:

> I am developing a device-mapper client library in Rust.  The language is
> relevant here because Rust's standard library makes a strong distinction
> between strings that are known to be encoded in UTF-8 and strings whose
> encoding is not known (`str` vs `OsStr`), with the latter being awkward
> to use compared to the former.
> 
> The device-mapper ioctls use a lot of strings, and it's not clear
> to me from either the documentation or the kernel's source code
> what expectations the *kernel* has of the character encoding of these
> strings.  I don't expect the kernel to care about UTF-8 versus
> ISO-8859-n or whatever, but it is plausible to me that at least some
> of the strings used in device-mapper ioctls can be expected always to
> be ASCII, in which case I can safely use `str` for them on the library
> side and it'll be more ergonomic for end users.
> 
> There are, I think, four different kinds of strings used in device-
> mapper ioctls: target names, device names and UUIDs, table parameters,
> and target messages.  For each of these, I would like to ask:
> 
> * Are there any circumstances where non-ASCII bytes *should* be
> used in this type of string *when supplied to the kernel*?

I think there's no reason to use non-ASCII characters in device mapper.

> * Are there any circumstances where a client library should expect
> non-ASCII bytes to *appear* in this type of string *when reported
> back to the client by the kernel*?

If the user creates a device whose name contains non-ASCII characters, the
kernel ioctls will report the name back with those bytes unchanged.

> * If there are any such circumstances (either way), does the kernel
> try to interpret non-ASCII bytes in this type of string as characters
> in a specific encoding, or does it just treat them as an opaque
> token (as is typical e.g. for pathname components)?

The kernel just passes non-ASCII bytes through as opaque tokens. It doesn't
attempt to interpret them.
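
On the library side, one way to cope with this is to keep the bytes as they
come back and only hand out a `str` when they happen to be valid UTF-8. The
following is a minimal sketch, not a recommendation for the actual API: the
names `DmName` and `classify_name` are made up for illustration, and it
assumes the name arrives as a byte buffer that may carry trailing NUL padding.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

/// Hypothetical wrapper: a name reported by the kernel, classified by
/// whether its bytes happen to be valid UTF-8.
#[derive(Debug, PartialEq)]
pub enum DmName<'a> {
    /// The bytes are valid UTF-8 (this includes plain ASCII).
    Utf8(&'a str),
    /// The bytes are not valid UTF-8; keep them as an opaque token.
    Opaque(&'a OsStr),
}

/// Classify a raw name buffer (hypothetical helper).
/// Assumption: the buffer may be padded with NUL bytes, so everything
/// from the first NUL onward is ignored.
pub fn classify_name(raw: &[u8]) -> DmName<'_> {
    let end = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    let bytes = &raw[..end];
    match std::str::from_utf8(bytes) {
        Ok(s) => DmName::Utf8(s),
        Err(_) => DmName::Opaque(OsStr::from_bytes(bytes)),
    }
}
```

With this shape, callers that only ever create ASCII names always see the
`Utf8` variant, while a name created by some other tool with arbitrary bytes
is still representable instead of causing an error.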

But udev mangles non-ASCII characters, i.e. if you create a device with

    dmsetup create ěščřžýáíé --table '0 3951256 zero'

you get

    ls -la /dev/mapper/
    '\xc4\x9b\xc5\xa1\xc4\x8d\xc5\x99\xc5\xbe\xc3\xbd\xc3\xa1\xc3\xad\xc3\xa9' -> ../dm-0
    ls -al /dev/disk/by-id/
    'dm-name-\xc4\x9b\xc5\xa1\xc4\x8d\xc5\x99\xc5\xbe\xc3\xbd\xc3\xa1\xc3\xad\xc3\xa9' -> ../../dm-0

"dmsetup table" will show

    ěščřžýáíé: 0 3951256 zero

and "dmsetup remove ěščřžýáíé" will remove the device.
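
If a client library needs to map a mangled /dev/mapper entry back to the name
the kernel knows, the `\xNN` form shown above can be decoded byte by byte.
This is only a sketch based on the listing above: `unmangle` and `hex_val`
are hypothetical helpers, and it assumes every mangled byte appears as a
backslash, an `x`, and exactly two hex digits, with everything else copied
through unchanged.

```rust
/// Value of one ASCII hex digit, or None if the byte is not a hex digit.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Decode `\xNN` escapes into raw bytes (hypothetical helper).
/// Anything that is not a well-formed escape is copied as-is.
pub fn unmangle(mangled: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(mangled.len());
    let mut i = 0;
    while i < mangled.len() {
        // A full escape needs four bytes: '\\', 'x', and two hex digits.
        if i + 3 < mangled.len() && mangled[i] == b'\\' && mangled[i + 1] == b'x' {
            if let (Some(hi), Some(lo)) = (hex_val(mangled[i + 2]), hex_val(mangled[i + 3])) {
                out.push((hi << 4) | lo);
                i += 4;
                continue;
            }
        }
        out.push(mangled[i]);
        i += 1;
    }
    out
}
```

Applied to the symlink name from the listing, this should give back the
UTF-8 bytes of the original device name, which can then be passed to the
ioctls as-is.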

Mikulas

> Thanks,
> zw
> 
> please cc: me directly on all responses, I'm not subscribed to the list
> 
