On 6/11/19 10:21 AM, David Howells wrote:
To see if we can try and make progress on this, can we try and come at this from another angle: what do LSMs *actually* need to do this? And I grant that each LSM might require different things.
I think part of the problem here is that the discussion is too abstract and not dealing with the specifics of the notifications in question. Those details matter.
[A] There are a bunch of things available, some of which may be coincident, depending on the context:
(1) The creds of the process that created a watch_queue (ie. opened /dev/watch_queue).
These will be used when checking permissions to open /dev/watch_queue.
(2) The creds of the process that set a watch (ie. called watch_sb, KEYCTL_NOTIFY, ...).
These will be used when checking permissions to set a watch.
(3) The creds of the process that tripped the event (which might be the system).
These will be used when checking permission to perform whatever operation tripped the event (if the event is triggered by a userspace operation).
(4) The security attributes of the object on which the watch was set (uid, gid, mode, labels).
These will be used when checking permissions to set the watch.
(5) The security attributes of the object on which the event was tripped.
These will be used when checking permission to perform whatever operation tripped the event.
(6) The security attributes of all the objects between the object in (5) and the object in (4), assuming we work from (5) towards (4) if the two aren't coincident (WATCH_INFO_RECURSIVE).
Does this apply to anything other than mount notifications? And for mount notifications, isn't the notification actually for a change to the mount namespace, not a change to any file? Hence, the real "object" for events that trigger mount notifications is the mount namespace, right? The watched path is just a way of identifying a subtree of the mount namespace for notifications - it isn't the real object being watched.
At the moment, when post_one_notification() wants to write a notification into a queue, it calls security_post_notification() to ask if it should be allowed to do so. This is passed (1) and (3) above plus the notification record.
Not convinced we need this.
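To make the delivery-time check concrete, here is a minimal userspace sketch of a hook that receives (1), (3) and the notification record. It is not the kernel's security_post_notification(): the struct names, the function name and the uid-matching policy are all assumptions invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct cred or
 * its notification record. */
struct model_cred {
	unsigned int uid;
};

struct model_notification {
	unsigned int type;
	unsigned int subtype;
};

/*
 * Toy delivery-time policy (an assumption, not any real LSM's rule):
 * allow the write into the queue if the event was tripped by the system
 * (trigger_cred == NULL) or by the same uid that created the queue.
 */
static int model_post_notification(const struct model_cred *queue_cred,
				   const struct model_cred *trigger_cred,
				   const struct model_notification *n)
{
	assert(queue_cred != NULL && n != NULL);

	if (trigger_cred == NULL)	/* (3) is "the system" */
		return 0;
	if (queue_cred->uid == trigger_cred->uid)
		return 0;
	return -EACCES;
}
```

Note that nothing in this check looks at the watched object, which is part of why a check at this point may be redundant if the set-watch hook and the existing object-access hooks already cover the operation.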
[B] There are a number of places I can usefully potentially add hooks:
(a) The point at which a watch queue is created (ie. /dev/watch_queue is opened).
Already covered by existing hooks on opening files.
(b) The point at which a watch is set (ie. watch_sb).
Yes, this requires a hook and corresponding check.
(c) The point at which a notification is generated (ie. an automount point is tripped).
Preferably covered by existing hooks on object accesses that would generate notifications.
(d) The point at which a notification is delivered (ie. we write the message into the queue).
Preferably not needed.
(e) All the points at which we walk over an object in a chain from (c) to find the watch on which we can effect (d) (eg. we walk rootwards from a mountpoint to find watches on a branch in the mount topology).
Not necessary if the real object of mount notifications is the mount namespace and if we do not support recursive notifications on e.g. directories or some other object where the two can truly diverge.
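The rootward walk in (e) can be sketched in userspace as follows, assuming permission was already checked once when each watch was set, so the walk itself does no per-object checks. The struct and function names are hypothetical, and the `recursive` flag only loosely models WATCH_INFO_RECURSIVE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one node in a mount topology. */
struct model_mount {
	struct model_mount *parent;	/* NULL at the root */
	bool watched;			/* a watch is set on this node */
	bool recursive;			/* loosely models WATCH_INFO_RECURSIVE */
};

/*
 * Walk rootwards from the node where the event was tripped and count
 * the watches that would receive it.  A watch matches if it sits on
 * the origin itself, or on an ancestor and is recursive.  No permission
 * check is made on the intermediate nodes.
 */
static int model_count_watchers(const struct model_mount *origin)
{
	int hits = 0;

	assert(origin != NULL);

	for (const struct model_mount *m = origin; m != NULL; m = m->parent) {
		if (!m->watched)
			continue;
		if (m == origin || m->recursive)
			hits++;
	}
	return hits;
}
```

If per-node checks were wanted at (e), they would have to go inside this loop, which is where the locking and sleeping concerns raised under (z) come in.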
[C] Problems that need to be resolved:
(x) Do I need to put a security pointer in struct watch for the active LSM to fill in? If so, I presume this would need passing to security_post_notification().
I don't see why or where it would get used.
(y) What checks should be done on object destruction after final put and what contexts need to be supplied?
IMHO, no.
This one is made all the harder because the creds that are in force when close(), exit(), exec(), dup2(), etc. close a file descriptor might need to be propagated to deferred-fput, which must in turn propagate them to af_unix-cleanup, and thence back to deferred-fput and thence to implicit unmount (dissolve_on_fput()[*]).
[*] Though it should be noted that if this happens, the subtree cannot be attached to the root of a namespace.
Further, if several processes are sharing a file object, it's not predictable as to which process the final notification will come from.
(z) Do intermediate objects, say in a mount topology notification, actually need to be checked against the watcher's creds? For a mount topology notification, would this require calling inode_permission() for each intervening directory?
I don't think so, because the real object is the mount namespace, not the individual directories.
Doing that might be impractical as it would probably have to be done outside of the RCU read lock and the filesystem ->permission() hooks might want to sleep (to touch disk or talk to a server).