On Mon, Feb 24, 2020 at 5:05 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote: > A Landlock object enables to identify a kernel object (e.g. an inode). > A Landlock rule is a set of access rights allowed on an object. Rules > are grouped in rulesets that may be tied to a set of processes (i.e. > subjects) to enforce a scoped access-control (i.e. a domain). > > Because Landlock's goal is to empower any process (especially > unprivileged ones) to sandbox themselves, we can't rely on a system-wide > object identification such as file extended attributes. Indeed, we need > innocuous, composable and modular access-controls. > > The main challenge with this constraints is to identify kernel objects > while this identification is useful (i.e. when a security policy makes > use of this object). But this identification data should be freed once > no policy is using it. This ephemeral tagging should not and may not be > written in the filesystem. We then need to manage the lifetime of a > rule according to the lifetime of its object. To avoid a global lock, > this implementation make use of RCU and counters to safely reference > objects. > > A following commit uses this generic object management for inodes. [...] > diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig > new file mode 100644 > index 000000000000..4a321d5b3f67 > --- /dev/null > +++ b/security/landlock/Kconfig > @@ -0,0 +1,15 @@ > +# SPDX-License-Identifier: GPL-2.0-only > + > +config SECURITY_LANDLOCK > + bool "Landlock support" > + depends on SECURITY > + default n (I think "default n" is implicit?) > + help > + This selects Landlock, a safe sandboxing mechanism. It enables to > + restrict processes on the fly (i.e. enforce an access control policy), > + which can complement seccomp-bpf. The security policy is a set of access > + rights tied to an object, which could be a file, a socket or a process. > + > + See Documentation/security/landlock/ for further information. > + > + If you are unsure how to answer this question, answer N. [...] > diff --git a/security/landlock/object.c b/security/landlock/object.c > new file mode 100644 > index 000000000000..38fbbb108120 > --- /dev/null > +++ b/security/landlock/object.c > @@ -0,0 +1,339 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Landlock LSM - Object and rule management > + * > + * Copyright © 2016-2020 Mickaël Salaün <mic@xxxxxxxxxxx> > + * Copyright © 2018-2020 ANSSI > + * > + * Principles and constraints of the object and rule management: > + * - Do not leak memory. > + * - Try as much as possible to free a memory allocation as soon as it is > + * unused. > + * - Do not use global lock. > + * - Do not charge processes other than the one requesting a Landlock > + * operation. > + */ > + > +#include <linux/bug.h> > +#include <linux/compiler.h> > +#include <linux/compiler_types.h> > +#include <linux/err.h> > +#include <linux/errno.h> > +#include <linux/fs.h> > +#include <linux/kernel.h> > +#include <linux/list.h> > +#include <linux/rbtree.h> > +#include <linux/rcupdate.h> > +#include <linux/refcount.h> > +#include <linux/slab.h> > +#include <linux/spinlock.h> > +#include <linux/workqueue.h> > + > +#include "object.h" > + > +struct landlock_object *landlock_create_object( > + const enum landlock_object_type type, void *underlying_object) > +{ > + struct landlock_object *object; > + > + if (WARN_ON_ONCE(!underlying_object)) > + return NULL; > + object = kzalloc(sizeof(*object), GFP_KERNEL); > + if (!object) > + return NULL; > + refcount_set(&object->usage, 1); > + refcount_set(&object->cleaners, 1); > + spin_lock_init(&object->lock); > + INIT_LIST_HEAD(&object->rules); > + object->type = type; > + WRITE_ONCE(object->underlying_object, underlying_object); `object` is not globally visible at this point, so WRITE_ONCE() is unnecessary. > + return object; > +} > + > +struct landlock_object *landlock_get_object(struct landlock_object *object) > + __acquires(object->usage) > +{ > + __acquire(object->usage); > + /* > + * If @object->usage equal 0, then it will be ignored by writers, and > + * underlying_object->object may be replaced, but this is not an issue > + * for release_object(). > + */ > + if (object && refcount_inc_not_zero(&object->usage)) { > + /* > + * It should not be possible to get a reference to an object if > + * its underlying object is being terminated (e.g. with > + * landlock_release_object()), because an object is only > + * modifiable through such underlying object. This is not the > + * case with landlock_get_object_cleaner(). > + */ > + WARN_ON_ONCE(!READ_ONCE(object->underlying_object)); > + return object; > + } > + return NULL; > +} > + > +static struct landlock_object *get_object_cleaner( > + struct landlock_object *object) > + __acquires(object->cleaners) > +{ > + __acquire(object->cleaners); > + if (object && refcount_inc_not_zero(&object->cleaners)) > + return object; > + return NULL; > +} I don't get this whole "cleaners" thing. Can you give a quick description of why this is necessary, and what benefits it has over a standard refcounting+RCU scheme? I don't immediately see anything that requires this. > +/* > + * There is two cases when an object should be free and the reference to the > + * underlying object should be put: > + * - when the last rule tied to this object is removed, which is handled by > + * landlock_put_rule() and then release_object(); > + * - when the object is being terminated (e.g. no more reference to an inode), > + * which is handled by landlock_put_object(). > + */ > +static void put_object_free(struct landlock_object *object) > + __releases(object->cleaners) > +{ > + __release(object->cleaners); > + if (!refcount_dec_and_test(&object->cleaners)) > + return; > + WARN_ON_ONCE(refcount_read(&object->usage)); > + /* > + * Ensures a safe use of @object in the RCU block from > + * landlock_put_rule(). > + */ > + kfree_rcu(object, rcu_free); > +} > + > +/* > + * Destroys a newly created and useless object. > + */ > +void landlock_drop_object(struct landlock_object *object) > +{ > + if (WARN_ON_ONCE(!refcount_dec_and_test(&object->usage))) > + return; > + __acquire(object->cleaners); > + put_object_free(object); > +} > + > +/* > + * Puts the underlying object (e.g. inode) if it is the first request to > + * release @object, without calling landlock_put_object(). > + * > + * Return true if this call effectively marks @object as released, false > + * otherwise. > + */ > +static bool release_object(struct landlock_object *object) > + __releases(&object->lock) > +{ > + void *underlying_object; > + > + lockdep_assert_held(&object->lock); > + > + underlying_object = xchg(&object->underlying_object, NULL); > + spin_unlock(&object->lock); > + might_sleep(); > + if (!underlying_object) > + return false; > + > + switch (object->type) { > + case LANDLOCK_OBJECT_INODE: > + break; > + default: > + WARN_ON_ONCE(1); > + } > + return true; > +} > + > +static void put_object_cleaner(struct landlock_object *object) > + __releases(object->cleaners) > +{ > + /* Let's try an early lockless check. */ > + if (list_empty(&object->rules) && > + READ_ONCE(object->underlying_object)) { > + /* > + * Puts @object if there is no rule tied to it and the > + * remaining user is the underlying object. This check is > + * atomic because @object->rules and @object->underlying_object > + * are protected by @object->lock. > + */ > + spin_lock(&object->lock); > + if (list_empty(&object->rules) && > + READ_ONCE(object->underlying_object) && > + refcount_dec_if_one(&object->usage)) { > + /* > + * Releases @object, in place of > + * landlock_release_object(). > + * > + * @object is already empty, implying that all its > + * previous rules are already disabled. > + * > + * Unbalance the @object->cleaners counter to reflect > + * the underlying object release. > + */ > + if (!WARN_ON_ONCE(!release_object(object))) { > + __acquire(object->cleaners); > + put_object_free(object); > + } > + } else { > + spin_unlock(&object->lock); > + } > + } > + put_object_free(object); > +} > + > +/* > + * Putting an object is easy when the object is being terminated, but it is > + * much more tricky when the reason is that there is no more rule tied to this > + * object. Indeed, new rules could be added at the same time. > + */ > +void landlock_put_object(struct landlock_object *object) > + __releases(object->usage) > +{ > + struct landlock_object *object_cleaner; > + > + __release(object->usage); > + might_sleep(); > + if (!object) > + return; > + /* > + * Guards against concurrent termination to be able to terminate > + * @object if it is empty and not referenced by another rule-appender > + * other than the underlying object. > + */ > + object_cleaner = get_object_cleaner(object); > + if (WARN_ON_ONCE(!object_cleaner)) { > + __release(object->cleaners); > + return; > + } > + /* > + * Decrements @object->usage and if it reach zero, also decrement > + * @object->cleaners. If both reach zero, then release and free > + * @object. > + */ > + if (refcount_dec_and_test(&object->usage)) { > + struct landlock_rule *rule_walker, *rule_walker2; > + > + spin_lock(&object->lock); > + /* > + * Disables all the rules tied to @object when it is forbidden > + * to add new rule but still allowed to remove them with > + * landlock_put_rule(). This is crucial to be able to safely > + * free a rule according to landlock_rule_is_disabled(). > + */ > + list_for_each_entry_safe(rule_walker, rule_walker2, > + &object->rules, list) > + list_del_rcu(&rule_walker->list); > + > + /* > + * Releases @object if it is not already released (e.g. with > + * landlock_release_object()). > + */ > + release_object(object); > + /* > + * Unbalances the @object->cleaners counter to reflect the > + * underlying object release. > + */ > + __acquire(object->cleaners); > + put_object_free(object); > + } > + put_object_cleaner(object_cleaner); > +} > + > +void landlock_put_rule(struct landlock_object *object, > + struct landlock_rule *rule) > +{ > + if (!rule) > + return; > + WARN_ON_ONCE(!object); > + /* > + * Guards against a concurrent @object self-destruction with > + * landlock_put_object() or put_object_cleaner(). > + */ > + rcu_read_lock(); > + if (landlock_rule_is_disabled(rule)) { > + rcu_read_unlock(); > + if (refcount_dec_and_test(&rule->usage)) > + kfree_rcu(rule, rcu_free); > + return; > + } > + if (refcount_dec_and_test(&rule->usage)) { > + struct landlock_object *safe_object; > + > + /* > + * Now, @rule may still be enabled, or in the process of being > + * untied to @object by put_object_cleaner(). However, we know > + * that @object will not be freed until rcu_read_unlock() and > + * until @object->cleaners reach zero. Furthermore, we may not > + * be the only one willing to free a @rule linked with @object. > + * If we succeed to hold @object with get_object_cleaner(), we > + * know that until put_object_cleaner(), we can safely use > + * @object to remove @rule. > + */ > + safe_object = get_object_cleaner(object); > + rcu_read_unlock(); > + if (!safe_object) { > + __release(safe_object->cleaners); > + /* > + * We can safely free @rule because it is already > + * removed from @object's list. > + */ > + WARN_ON_ONCE(!landlock_rule_is_disabled(rule)); > + kfree_rcu(rule, rcu_free); > + } else { > + spin_lock(&safe_object->lock); > + if (!landlock_rule_is_disabled(rule)) > + list_del(&rule->list); > + spin_unlock(&safe_object->lock); > + kfree_rcu(rule, rcu_free); > + put_object_cleaner(safe_object); > + } > + } else { > + rcu_read_unlock(); > + } > + /* > + * put_object_cleaner() might sleep, but it is only reachable if > + * !landlock_rule_is_disabled(). Therefore, clean_ref() can not sleep. > + */ > + might_sleep(); > +} > + > +void landlock_release_object(struct landlock_object __rcu *rcu_object) > +{ > + struct landlock_object *object; > + > + if (!rcu_object) > + return; > + rcu_read_lock(); > + object = get_object_cleaner(rcu_dereference(rcu_object)); This is not how RCU works. You need the rcu annotation on the access to the data structure member (or global variable) that's actually being accessed. A "struct foo __rcu *foo" argument is essentially always wrong. > +struct landlock_rule { > + struct landlock_access access; > + /* > + * @list: Linked list with other rules tied to the same object, which > + * enable to manage their lifetimes. This is also used to identify if > + * a rule is still valid, thanks to landlock_rule_is_disabled(), which > + * is important in the matching process because the original object > + * address might have been recycled. > + */ > + struct list_head list; > + union { > + /* > + * @usage: Number of rulesets pointing to this rule. This > + * field is never used by RCU readers. > + */ > + refcount_t usage; > + struct rcu_head rcu_free; > + }; > +}; An object that is subject to RCU but whose refcount must not be accessed from RCU context? That seems a weird. > +enum landlock_object_type { > + LANDLOCK_OBJECT_INODE = 1, > +}; > + > +struct landlock_object { > + /* > + * @usage: Main usage counter, used to tie an object to it's underlying > + * object (i.e. create a lifetime) and potentially add new rules. I can't really follow this by reading this patch on its own. As one suggestion to make things at least a bit better, how about documenting here that `usage` always reaches zero before `cleaners` does? > + */ > + refcount_t usage; > + /* > + * @cleaners: Usage counter used to free a rule from @rules (thanks to > + * put_rule()). Enables to get a reference to this object until it > + * really become freed. Cf. put_object(). Maybe add: @usage being non-zero counts as one reference to @cleaners. Once @cleaners has become zero, the object is freed after an RCU grace period. > + */ > + refcount_t cleaners; > + union { > + /* > + * The use of this struct is controlled by @usage and > + * @cleaners, which makes it safe to union it with @rcu_free. > + */ [...] > + struct rcu_head rcu_free; > + }; > +}; [...] > +static inline bool landlock_rule_is_disabled( > + struct landlock_rule *rule) > +{ > + /* > + * Disabling (i.e. unlinking) a landlock_rule is a one-way operation. > + * It is not possible to re-enable such a rule, then there is no need > + * for smp_load_acquire(). > + * > + * LIST_POISON2 is set by list_del() and list_del_rcu(). > + */ > + return !rule || READ_ONCE(rule->list.prev) == LIST_POISON2; You're not allowed to do this, the comment above list_del() states: * Note: list_empty() on entry does not return true after this, the entry is * in an undefined state. If you want to be able to test whether the element is on a list afterwards, use stuff like list_del_init(). > +}