Re: [PATCH v14 22/23] LSM: Add /proc attr entry for full LSM context

Stephen Smalley <sds@xxxxxxxxxxxxx> · Mon, 10 Feb 2020 08:25:50 -0500

On 2/10/20 6:56 AM, Simon McVittie wrote:
On Mon, 03 Feb 2020 at 13:54:45 -0500, Stephen Smalley wrote:
The printable ASCII bit is based on what the dbus maintainer requested in
previous discussions.

I thought in previous discussions, we had come to the conclusion that
I can't assume it's 7-bit ASCII. (If I *can* assume that for this new
API, that's even better.)

To be clear, when I say ASCII I mean a sequence of bytes != '\0' with
their high bit unset (x & 0x7f == x) and the obvious mapping to/from
Unicode (bytes '\1' to '\x7f' represent codepoints U+0001 to U+007F). Is
that the same thing you mean?

I mean the subset of 7-bit ASCII that satisfies isprint() using the "C" 
locale.  That is already true for SELinux with the existing interfaces. 
I can't necessarily speak for the others.

I thought the conclusion we had come to in previous conversations was
that the LSM context is what GLib calls a "bytestring", the same as
filenames and environment variables - an opaque sequence of bytes != '\0',
with no further guarantees, and no specified encoding or mapping to/from
Unicode (most likely some superset of ASCII like UTF-8 or Latin-1,
but nobody knows which one, and they could equally well be some binary
encoding with no Unicode meaning, as long as it avoids '\0').

If I can safely assume that a new kernel <-> user-space API is constrained
to UTF-8 or a UTF-8 subset like ASCII, then I can provide more friendly
APIs for user-space features built over it. If that isn't possible, the
next best thing is a "bytestring" like filenames, environment variables,
and most kernel <-> user-space strings in general.

     smcv