On 2/10/20 8:25 AM, Stephen Smalley wrote:
On 2/10/20 6:56 AM, Simon McVittie wrote:
On Mon, 03 Feb 2020 at 13:54:45 -0500, Stephen Smalley wrote:
The printable ASCII bit is based on what the dbus maintainer requested in
previous discussions.
I thought in previous discussions, we had come to the conclusion that
I can't assume it's 7-bit ASCII. (If I *can* assume that for this new
API, that's even better.)
To be clear, when I say ASCII I mean a sequence of bytes != '\0' with
their high bit unset ((x & 0x7f) == x) and the obvious mapping to/from
Unicode (bytes '\1' to '\x7f' represent codepoints U+0001 to U+007F). Is
that the same thing you mean?
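For concreteness, the definition above could be checked roughly as follows. This is only a sketch: label_is_7bit_ascii() is a hypothetical helper name, not part of any existing LSM or D-Bus API, and it takes an explicit length so that an embedded '\0' can be detected rather than silently ending the string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an existing API): returns true if every byte
 * in buf[0..len) is non-NUL and has its high bit unset, i.e. lies in the
 * range '\1'..'\x7f'.
 */
bool label_is_7bit_ascii(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        unsigned char x = buf[i];

        /* Parentheses matter: in C, == binds tighter than &. */
        if (x == 0 || (x & 0x7f) != x)
            return false;
    }
    return true;
}
```

Under this definition control characters such as '\1' or DEL ('\x7f') are accepted, while any byte of a multi-byte UTF-8 sequence or a Latin-1 accented letter is rejected.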
I mean the subset of 7-bit ASCII that satisfies isprint() using the "C"
locale. That is already true for SELinux with the existing interfaces.
I can't necessarily speak for the others.
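As an illustration of the narrower rule, here is a sketch of a check for "satisfies isprint() in the "C" locale". In that locale isprint() is true exactly for the bytes 0x20 (space) through 0x7e ('~'), so the sketch uses a plain range test rather than calling isprint(), whose result depends on whatever locale the calling process has set. The name label_is_printable_ascii() is hypothetical, and this is not claimed to be the code SELinux itself uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an existing API): returns true if every byte
 * in buf[0..len) is printable ASCII, i.e. in the range 0x20..0x7e. This
 * matches isprint() in the "C" locale without depending on the locale
 * currently set in the process.
 */
bool label_is_printable_ascii(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] < 0x20 || buf[i] > 0x7e)
            return false;
    }
    return true;
}
```

Compared with plain 7-bit ASCII, this additionally rejects control characters (tab, newline, DEL and so on). An empty buffer is accepted by this sketch; whether an empty label is meaningful is a separate question.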
Looks like Smack labels are similarly restricted, per
Documentation/admin-guide/LSM/Smack.rst. So I guess the only one that
is perhaps unclear is AppArmor, since its labels are typically derived
from pathnames? Can an AppArmor label returned via its getprocattr()
hook be any legal pathname?
I thought the conclusion we had come to in previous conversations was
that the LSM context is what GLib calls a "bytestring", the same as
filenames and environment variables - an opaque sequence of bytes != '\0',
with no further guarantees, and no specified encoding or mapping to/from
Unicode (most likely some superset of ASCII like UTF-8 or Latin-1,
but nobody knows which one, and they could equally well be some binary
encoding with no Unicode meaning, as long as it avoids '\0').
If I can safely assume that a new kernel <-> user-space API is constrained
to UTF-8 or a UTF-8 subset like ASCII, then I can provide more friendly
APIs for user-space features built over it. If that isn't possible, the
next best thing is a "bytestring" like filenames, environment variables,
and most kernel <-> user-space strings in general.
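To make the difference concrete: if the context is only a "bytestring", user-space would have to do something like the following before it could offer a text-based API, and fall back to raw bytes whenever the check fails. This is a sketch under assumptions, with a hypothetical function name; it is a straightforward strict UTF-8 validator (rejecting '\0', overlong forms, surrogates and code points above U+10FFFF), not code taken from dbus or GLib.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an existing API): returns true if s[0..len)
 * is well-formed UTF-8 containing no NUL byte.
 */
bool context_is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i];
        size_t n;          /* number of continuation bytes expected */
        unsigned long cp;  /* code point being decoded */
        unsigned long min; /* smallest code point valid for this length */

        if (c == 0)
            return false;
        if (c < 0x80) {    /* plain ASCII byte */
            i++;
            continue;
        }
        if ((c & 0xe0) == 0xc0) {
            n = 1; cp = c & 0x1f; min = 0x80;
        } else if ((c & 0xf0) == 0xe0) {
            n = 2; cp = c & 0x0f; min = 0x800;
        } else if ((c & 0xf8) == 0xf0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;  /* stray continuation byte or invalid lead */
        }

        if (len - i <= n)
            return false;  /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xc0) != 0x80)
                return false;
            cp = (cp << 6) | (s[i + k] & 0x3f);
        }

        /* Reject overlong encodings, surrogates and out-of-range values. */
        if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
            return false;

        i += n + 1;
    }
    return true;
}
```

If the kernel API guaranteed ASCII or UTF-8, a check like this would never fail and the fallback path could be dropped; with a bytestring, a Latin-1 label such as "caf\xe9" fails it and has to be passed around as opaque bytes.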
smcv