Hello Mickaël! On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote: > This subject is not easy, but I think we're reaching a consensus (see my > 3-steps proposal plan below). I answered your questions about the (complex) > interface I proposed, but we should focus on the first step now (your > initial proposal) and get back to the other steps later in another email > thread. Thanks for the review! > On 10/05/2023 21:21, Günther Noack wrote: > > [...] > > Some specific things I don't understand well are: > > [...] Thanks, this all make sense now. 👍 > > Would it not be a more orthogonal API if the "file selection" part > > of the Landlock API and the "policy adding" part for these selected > > files were independent of each other? Then the .device and > > .file_type selection scheme could be used for the existing policies > > as well? > > Both approaches have pros and cons. I propose a new incremental approach > below that starts with the simple case where there is no direct links > between different rule types (only the third step add that). > > > > > * When restricting by dev_t (major and minor number), aren't there use > > cases where a system might have 256 CDROM drives, and you'd need to > > allow-list all of these minor number combinations? > > Indeed, we should be able to just ignore device minors. They device numbers are listed in https://www.kernel.org/doc/Documentation/admin-guide/devices.txt Some major numbers are grab-bags of miscellaneous devices, for example major numbers 10 and 13; 13/0-31 (maj/min) are joysticks, whereas 13/32-62) are mice. Maybe this can be specified as ranges on dev_t, so that it would be possible to specify 13/32-62, without matching the joysticks too? > > I think that it might be a feasible approach to start with the > > LANDLOCK_ACCESS_FS_IOCTL approach and then look at its usage to > > understand whether we see a significant number of programs whose > > sandboxes are too coarse because of this. > > > > If more fine-granular control is needed, we can still put the other > > approach on top, and the additional complexity from > > LANDLOCK_ACCESS_FS_IOCTL that we have to support is not that > > dramatically high. > > > > [...] > > I agree that IOCTLs are a security risk and that we should propose a simple > solution short-term, and maybe a more complete one long-term. > > The main issue with a unique IOCTL access right for a file hierarchy is that > we may not know which device/driver will be the target/server, and hence if > we need to allow some IOCTL for regular files (e.g., fscrypt), we might end > up allowing all IOCTLs. > > Here is a plan to incrementally develop a fine-grained IOCTL access control > in 3 steps: > > 1/ Add a simple IOCTL access right for path_beneath: what you initially > proposed. For systems that already configure nodev mount points, it could be > even more useful (e.g., safely allow IOCTL on /home for fscrypt, and > specific /dev files otherwise). Ack, I'll continue on that implementation then. 👍 > 2/ Create a new type of rule to identify file/device type: > struct landlock_inode_type_attr { > __u64 allowed_access; /* same as for path_beneath */ > __u64 flags; /* bit 0: ignores device minor */ > dev_t device; /* same as stat's st_rdev */ > __u16 file_type; /* same as stat's st_mode & S_IFMT */ > }; > > We'll probably need to differentiate the handled accesses for path_beneath > rules and those for inode_type rules to be more useful. > > One issue with this type of rule is that it could be used as an oracle to > bypass stat restrictions. We could check if such (virtual) action is allowed > without the current domain though. > > > 3/ Add a new type of rule to match IOCTL commands, with a mechanism to tie > this to inode_type rules (because a command ID is relative to a file > type/device), and potentially the same mechanism to tie inode_type rules to > path_beneath rules. > > > Each of this step can be implemented one after the other, and each one is > valuable. What do you think? I think it is a good idea to do it in multiple steps, as I also believe that step 1) already provides value on its own. To make sure we are on the same page, let me paraphrase my understanding here: 1) is what I already sent as RFC, with tests and documentation etc. With 1), callers could allow and deny ioctls on newly opened files, independent of file path. 2) makes dev_t and file_type predicates which can be used to limit file accesses on already opened files. With 2), callers could allow and deny ioctls and other operations also on files which are already opened before enablement, such as the TTY FD inherited from the parent process. 3) would make it possible to restrict individual ioctl commands, depending on the dev_t, the file_type, and possibly the path. Is that accurate? In 2) and 3), I'm still contemplating a bit whether we have alternative approaches, but I believe in any case, it's clear that they can be built on top of each other without much overhead, and for many programs, outright denying ioctl on newly opened files with 1) might be the most straightforward approach which already delivers value. Some notes which I collected on the side: * I'm still worried about the already-opened file descriptors like the TTY, through which it can be easy to escape a sandbox (c.f. https://nvd.nist.gov/vuln/detail/CVE-2017-5226) - I would like to have a solution for that. Step (2) would fix it, but maybe I can find a workaround to at least detect the problem in step (1) already. * I looked at OpenBSD's pledge and unveil. In OpenBSD, the ioctl policy is based on the properties of an already opened file. They have hardcoded their ioctl logic in the kernel, which would be a mistake to do in Linux due to stronger backwards compatibility guarantees. OpenBSD is making the allow/deny decisions based on device major/minor numbers and file type (usually block/char device). I think it is noteworthy that OpenBSD does *not* use the file path to make that decision. I wonder whether we should replicate that in steps (2) and (3). It would make for a simpler, more orthogonal API, where the ioctl policy would become a property of the task, and it would be enforced independently of the existing path-based logic. When callers combine that with the overall ioctl check from (1), it might be flexible enough for practical purposes. * We should not rely on the way that ioctl request numbers are structured, because it is used inconsistently. More background: https://lwn.net/Articles/428140/ (Thank you for pointing out this article, Jeff!) > > > > * Should we introduce a "landlock_fd_rights_limit()" syscall? > > > [...] > > > > > > We should also think about batch operations on FD (see the > > > close_range syscall), for instance to deny all IOCTLs on inherited > > > or received FDs. > > > > Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the > > rights for an entire range of FDs? > > > > I have to admit, I'm not familiar with the real-life use cases of > > close_range(). In most programs I work with, it's difficult to reason > > about their ordering once the program has really started to run. So I > > imagine that close_range() is mostly used to "sanitize" the open file > > descriptors at the start of main(), and you have a similar use case in > > mind for this one as well? > > > > If it's just about closing the range from 0 to 2, I'm not sure it's > > worth adding a custom syscall. :) > > The advantage of this kind of range is to efficiently manage all potential > FDs, and the main use case is to close (or change, see the flags) everything > *except" 0-2 (i.e. 3-~0), and then avoid a lot of (potentially useless) > syscalls. > > The Landlock interface doesn't need to be a syscall. We could just add a new > rule type which could take a FD range and restrict them when calling > landlock_restrict_self(). Something like this: > struct landlock_fd_attr { > __u64 allowed_access; > __u32 first; > __u32 last; > } Ack, I see it a bit better. Let's discuss this topic separately. —Günther -- Sent using Mutt 🐕 Woof Woof