On Tue, Mar 23, 2021 at 4:54 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote: > On 23/03/2021 01:13, Jann Horn wrote: > > On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote: > >> Using Landlock objects and ruleset, it is possible to tag inodes > >> according to a process's domain. > > [...] > >> +static void release_inode(struct landlock_object *const object) > >> + __releases(object->lock) > >> +{ > >> + struct inode *const inode = object->underobj; > >> + struct super_block *sb; > >> + > >> + if (!inode) { > >> + spin_unlock(&object->lock); > >> + return; > >> + } > >> + > >> + /* > >> + * Protects against concurrent use by hook_sb_delete() of the reference > >> + * to the underlying inode. > >> + */ > >> + object->underobj = NULL; > >> + /* > >> + * Makes sure that if the filesystem is concurrently unmounted, > >> + * hook_sb_delete() will wait for us to finish iput(). > >> + */ > >> + sb = inode->i_sb; > >> + atomic_long_inc(&landlock_superblock(sb)->inode_refs); > >> + spin_unlock(&object->lock); > >> + /* > >> + * Because object->underobj was not NULL, hook_sb_delete() and > >> + * get_inode_object() guarantee that it is safe to reset > >> + * landlock_inode(inode)->object while it is not NULL. It is therefore > >> + * not necessary to lock inode->i_lock. > >> + */ > >> + rcu_assign_pointer(landlock_inode(inode)->object, NULL); > >> + /* > >> + * Now, new rules can safely be tied to @inode with get_inode_object(). > >> + */ > >> + > >> + iput(inode); > >> + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs)) > >> + wake_up_var(&landlock_superblock(sb)->inode_refs); > >> +} > > [...] > >> +static struct landlock_object *get_inode_object(struct inode *const inode) > >> +{ > >> + struct landlock_object *object, *new_object; > >> + struct landlock_inode_security *inode_sec = landlock_inode(inode); > >> + > >> + rcu_read_lock(); > >> +retry: > >> + object = rcu_dereference(inode_sec->object); > >> + if (object) { > >> + if (likely(refcount_inc_not_zero(&object->usage))) { > >> + rcu_read_unlock(); > >> + return object; > >> + } > >> + /* > >> + * We are racing with release_inode(), the object is going > >> + * away. Wait for release_inode(), then retry. > >> + */ > >> + spin_lock(&object->lock); > >> + spin_unlock(&object->lock); > >> + goto retry; > >> + } > >> + rcu_read_unlock(); > >> + > >> + /* > >> + * If there is no object tied to @inode, then create a new one (without > >> + * holding any locks). > >> + */ > >> + new_object = landlock_create_object(&landlock_fs_underops, inode); > >> + if (IS_ERR(new_object)) > >> + return new_object; > >> + > >> + /* Protects against concurrent get_inode_object() calls. */ > >> + spin_lock(&inode->i_lock); > >> + object = rcu_dereference_protected(inode_sec->object, > >> + lockdep_is_held(&inode->i_lock)); > > > > rcu_dereference_protected() requires that inode_sec->object is not > > concurrently changed, but I think another thread could call > > get_inode_object() while we're in landlock_create_object(), and then > > we could race with the NULL write in release_inode() here? (It > > wouldn't actually be a UAF though because we're not actually accessing > > `object` here.) Or am I missing a lock that prevents this? > > > > In v28 this wasn't an issue because release_inode() was holding > > inode->i_lock (and object->lock) during the NULL store; but in v29 and > > this version the NULL store in release_inode() moved out of the locked > > region. I think you could just move the NULL store in release_inode() > > back up (and maybe add a comment explaining the locking rules for > > landlock_inode(...)->object)? > > > > (Or alternatively you could use rcu_dereference_raw() with a comment > > explaining that the read pointer is only used to check for NULL-ness, > > and that it is guaranteed that the pointer can't change if it is NULL > > and we're holding the lock. But that'd be needlessly complicated, I > > think.) > > To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in > release_inode() or in hook_sb_delete(), the > landlock_inode(inode)->object need to be non-NULL, Yes. > which implies that a > call to get_inode_object(inode) either "retry" (because release_inode is > only called by landlock_put_object, which set object->usage to 0) until > it creates a new object, or reuses the existing referenced object (and > increments object->usage). But it can be that landlock_inode(inode)->object only becomes non-NULL after get_inode_object() has checked rcu_dereference(inode_sec->object). > The worse case would be if > get_inode_object(inode) is called just before the > rcu_assign_pointer(landlock_inode(inode)->object, NULL) from > hook_sb_delete(), which would result in an object with a NULL underobj, > which is the expected behavior (and checked by release_inode). The scenario I'm talking about doesn't involve hook_sb_delete(). > The line rcu_assign_pointer(inode_sec->object, new_object) from > get_inode_object() can only be reached if the underlying inode doesn't > reference an object, Yes. > in which case hook_sb_delete() will not reach the > rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this > same inode. > > This works because get_inode_object(inode) is mutually exclusive to > itself with the same inode (i.e. an inode can only point to an object > that references this same inode). To clarify: You can concurrently call get_inode_object() multiple times on the same inode, right? There are no locks held on entry to that function. > I tried to explain this with the comment "Protects against concurrent > get_inode_object() calls" in get_inode_object(), and the comments just > before both rcu_assign_pointer(landlock_inode(inode)->object, NULL). The scenario I'm talking about is: Initially the inode does not have an associated landlock_object. There are two threads A and B. Thread A is going to execute get_inode_object(). Thread B is going to execute get_inode_object() followed immediately by landlock_put_object(). thread A: enters get_inode_object() thread A: rcu_dereference(inode_sec->object) returns NULL thread A: enters landlock_create_object() thread B: enters get_inode_object() thread B: rcu_dereference(inode_sec->object) returns NULL thread B: calls landlock_create_object() thread B: sets inode_sec->object while holding inode->i_lock thread B: leaves get_inode_object() thread B: enters landlock_put_object() thread B: object->usage drops to 0, object->lock is taken thread B: calls release_inode() thread B: drops object->lock thread A: returns from landlock_create_object() thread A: takes inode->i_lock At this point, thread B will run: rcu_assign_pointer(landlock_inode(inode)->object, NULL); while thread A runs: rcu_dereference_protected(inode_sec->object, lockdep_is_held(&inode->i_lock)); meaning there is a (theoretical) data race, since rcu_dereference_protected() doesn't use READ_ONCE(). > >> + if (unlikely(object)) { > >> + /* Someone else just created the object, bail out and retry. */ > >> + spin_unlock(&inode->i_lock); > >> + kfree(new_object); > >> + > >> + rcu_read_lock(); > >> + goto retry; > >> + } > >> + > >> + rcu_assign_pointer(inode_sec->object, new_object); > >> + /* > >> + * @inode will be released by hook_sb_delete() on its superblock > >> + * shutdown. > >> + */ > >> + ihold(inode); > >> + spin_unlock(&inode->i_lock); > >> + return new_object; > >> +}