On Mon, 03 Feb 2020 at 13:54:45 -0500, Stephen Smalley wrote:
> The printable ASCII bit is based on what the dbus maintainer requested in
> previous discussions.

I thought in previous discussions, we had come to the conclusion that I
can't assume it's 7-bit ASCII. (If I *can* assume that for this new API,
that's even better.)

To be clear, when I say ASCII I mean a sequence of bytes != '\0' with
their high bit unset ((x & 0x7f) == x) and the obvious mapping to/from
Unicode (bytes '\1' to '\x7f' represent codepoints U+0001 to U+007F).
Is that the same thing you mean?

I thought the conclusion we had come to in previous conversations was
that the LSM context is what GLib calls a "bytestring", the same as
filenames and environment variables - an opaque sequence of bytes != '\0',
with no further guarantees, and no specified encoding or mapping to/from
Unicode (most likely some superset of ASCII like UTF-8 or Latin-1, but
nobody knows which one, and it could equally well be some binary
encoding with no Unicode meaning, as long as it avoids '\0').

If I can safely assume that a new kernel <-> user-space API is
constrained to UTF-8 or a UTF-8 subset like ASCII, then I can provide
more friendly APIs for user-space features built over it. If that isn't
possible, the next best thing is a "bytestring" like filenames,
environment variables, and most kernel <-> user-space strings in general.

    smcv
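
For concreteness, the two definitions above ("ASCII" versus "bytestring")
could be written as checks along these lines. This is only a sketch of
what I mean by the terms: the names is_bytestring() and is_ascii() are
made up for illustration and are not part of any existing kernel, GLib
or dbus API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if buf[0..len) is a "bytestring" in the
 * sense above, i.e. it contains no '\0' byte. Nothing else is
 * guaranteed: no encoding, no mapping to/from Unicode. */
static bool is_bytestring(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0)
            return false;
    }
    return true;
}

/* Hypothetical helper: true if buf[0..len) is ASCII in the sense
 * above: every byte is non-zero and has its high bit unset, so each
 * byte maps directly to a codepoint in U+0001..U+007F. */
static bool is_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char x = buf[i];

        if (x == 0 || (x & 0x7f) != x)
            return false;
    }
    return true;
}
```

Anything that passes is_ascii() also passes is_bytestring(), but not the
other way round: a UTF-8 or Latin-1 byte >= 0x80 is fine in a bytestring
and is rejected by the ASCII check.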