On Mon, Nov 23, 2020 at 10:16 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote: > On 23/11/2020 20:44, Jann Horn wrote: > > On Sat, Nov 21, 2020 at 11:06 AM Mickaël Salaün <mic@xxxxxxxxxxx> wrote: > >> On 21/11/2020 08:00, Jann Horn wrote: > >>> On Thu, Nov 12, 2020 at 9:52 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote: > >>>> Thanks to the Landlock objects and ruleset, it is possible to identify > >>>> inodes according to a process's domain. To enable an unprivileged > >>>> process to express a file hierarchy, it first needs to open a directory > >>>> (or a file) and pass this file descriptor to the kernel through > >>>> landlock_add_rule(2). When checking if a file access request is > >>>> allowed, we walk from the requested dentry to the real root, following > >>>> the different mount layers. The access to each "tagged" inodes are > >>>> collected according to their rule layer level, and ANDed to create > >>>> access to the requested file hierarchy. This makes possible to identify > >>>> a lot of files without tagging every inodes nor modifying the > >>>> filesystem, while still following the view and understanding the user > >>>> has from the filesystem. > >>>> > >>>> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not > >>>> keep the same struct inodes for the same inodes whereas these inodes are > >>>> in use. > >>>> > >>>> This commit adds a minimal set of supported filesystem access-control > >>>> which doesn't enable to restrict all file-related actions. This is the > >>>> result of multiple discussions to minimize the code of Landlock to ease > >>>> review. Thanks to the Landlock design, extending this access-control > >>>> without breaking user space will not be a problem. Moreover, seccomp > >>>> filters can be used to restrict the use of syscall families which may > >>>> not be currently handled by Landlock. > >>>> > >>>> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> > >>>> Cc: Anton Ivanov <anton.ivanov@xxxxxxxxxxxxxxxxxx> > >>>> Cc: James Morris <jmorris@xxxxxxxxx> > >>>> Cc: Jann Horn <jannh@xxxxxxxxxx> > >>>> Cc: Jeff Dike <jdike@xxxxxxxxxxx> > >>>> Cc: Kees Cook <keescook@xxxxxxxxxxxx> > >>>> Cc: Richard Weinberger <richard@xxxxxx> > >>>> Cc: Serge E. Hallyn <serge@xxxxxxxxxx> > >>>> Signed-off-by: Mickaël Salaün <mic@xxxxxxxxxxxxxxxxxxx> > >>>> --- > >>>> > >>>> Changes since v23: > >>>> * Enforce deterministic interleaved path rules. To have consistent > >>>> layered rules, granting access to a path implies that all accesses > >>>> tied to inodes, from the requested file to the real root, must be > >>>> checked. Otherwise, stacked rules may result to overzealous > >>>> restrictions. By excluding the ability to add exceptions in the same > >>>> layer (e.g. /a allowed, /a/b denied, and /a/b/c allowed), we get > >>>> deterministic interleaved path rules. This removes an optimization > >>> > >>> I don't understand the "deterministic interleaved path rules" part. > >> > >> I explain bellow. > >> > >>> > >>> > >>> What if I have a policy like this? > >>> > >>> /home/user READ > >>> /home/user/Downloads READ+WRITE > >>> > >>> That's a reasonable policy, right? > >> > >> Definitely, I forgot this, thanks for the outside perspective! > >> > >>> > >>> If I then try to open /home/user/Downloads/foo in WRITE mode, the loop > >>> will first check against the READ+WRITE rule for /home/user, that > >>> check will pass, and then it will check against the READ rule for /, > >>> which will deny the access, right? That seems bad. > >> > >> Yes that was the intent. > >> > >>> > >>> > >>> The v22 code ensured that for each layer, the most specific rule (the > >>> first we encounter on the walk) always wins, right? What's the problem > >>> with that? > >> > >> This can be explained with the interleaved_masked_accesses test: > >> https://github.com/landlock-lsm/linux/blob/landlock-v24/tools/testing/selftests/landlock/fs_test.c#L647 > >> > >> In this case there is 4 stacked layers: > >> layer 1: allows s1d1/s1d2/s1d3/file1 > >> layer 2: allows s1d1/s1d2/s1d3 > >> denies s1d1/s1d2 > >> layer 3: allows s1d1 > >> layer 4: allows s1d1/s1d2 > >> > >> In the v23, access to file1 would be allowed until layer 3, but layer 4 > >> would merge a new rule for the s1d2 inode. Because we don't record where > >> exactly the access come from, we can't tell that layer 2 allowed access > >> thanks to s1d3 and that its s1d2 rule was ignored. I think this behavior > >> doesn't make sense from the user point of view. > > > > Aah, I think I'm starting to understand the issue now. Basically, with > > the current UAPI, the semantics have to be "an access is permitted if, > > for each policy layer, at least one rule encountered on the pathwalk > > permits the access; rules that deny the access are irrelevant". And if > > it turns out that someone needs to be able to deny access to specific > > inodes, we'll have to extend struct landlock_path_beneath_attr. > > Right, I'll add this to the documentation (aligned with the new > implementation). > > > > > That reminds me... if we do need to make such a change in the future, > > it would be easier in terms of UAPI compatibility if > > landlock_add_rule() used copy_struct_from_user(), which is designed to > > create backwards and forwards compatibility with other version of UAPI > > headers. So adding that now might save us some headaches later. > > I used copy_struct_from_user() before v21, but Arnd wasn't a fan of > having type and size arguments, so we simplified the UAPI in the v21 by > removing the size argument. The type argument is enough to extend the > structure, but indeed, we lose the forward compatibility. Relying on one > syscall per rule type seems too much, though. You have a point there, I guess having a type argument is enough. (And if userspace tries to load a ruleset with "deny" rules that isn't supported by the current kernel, userspace will have to deal with that in some way anyway.) So thinking about it more, I guess the current version is probably actually fine, too.