On Mon, Aug 19, 2024 at 07:18:40AM GMT, Song Liu wrote: > Hi Christian, > > Thanks again for your suggestions here. I have got more questions on > this work. > > > On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > [...] > > >> I am not sure I follow the suggestion to implement this with > >> security_inode_permission()? Could you please share more details about > >> this idea? > > > > Given a path like /bin/gcc-6.9/gcc what that code currently does is: > > > > * walk down to /bin/gcc-6.9/gcc > > * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for: > > gcc > > gcc-6.9/ > > bin/ > > / > > > > That's broken because someone could've done > > mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on > > /attack even though the path lookup was for /bin/gcc-6.9. IOW, the > > security_file_open() checks have nothing to do with the permission checks that > > were done during path lookup. > > > > Why isn't that logic: > > > > * walk down to /bin/gcc-6.9/gcc and check for each component: > > > > security_inode_permission(/) > > security_inode_permission(gcc-6.9/) > > security_inode_permission(bin/) > > security_inode_permission(gcc) > > security_file_open(gcc) > > I am trying to implement this approach. The idea, IIUC, is: > > 1. For each open/openat, as we walk the path in do_filp_open=>path_openat, > check xattr for "/", "gcc-6.9/", "bin/" for all given flags. > 2. Save the value of the flag somewhere (for BPF, we can use inode local > storage). This is needed, because openat(dfd, ..) will not start from > root again. > 3. Propagate these flag to children. All the above are done at > security_inode_permission. > 4. Finally, at security_file_open, check the xattr with the file, which > is probably propagated from some parents. > > Did I get this right? > > IIUC, there are a few issues with this approach. > > 1. security_inode_permission takes inode as parameter. However, we need > dentry to get the xattr. Shall we change security_inode_permission > to take dentry instead? > PS: Maybe we should change most/all inode hooks to take dentry instead? security_inode_permission() is called in generic_permission() which in turn is called from inode_permission() which in turn is called from inode->i_op->permission() for various filesystems. So to make security_inode_permission() take a dentry argument one would need to change all inode->i_op->permission() to take a dentry argument for all filesystems. NAK on that. That's ignoring that it's just plain wrong to pass a dentry to **inode**_permission() or security_**inode**_permission(). It's permissions on the inode, not the dentry. > > 2. There is no easy way to propagate data from parent. Assuming we already > changed security_inode_permission to take dentry, we still need some > mechanism to look up xattr from the parent, which is probably still > something like bpf_dget_parent(). Or maybe we should add another hook > with both parent and child dentry as input? > > 3. Given we save the flag from parents in children's inode local storage, > we may consume non-trivial extra memory. BPF inode local storage will > be freed as the inode gets freed, so we will not leak any memory or > overflow some hash map. However, this will probably increase the > memory consumption of inode by a few percents. I think a "walk-up" > based approach will not have this problem, as we don't need the extra > storage. Of course, this means more xattr lookups in some cases. > > > > > I think that dget_parent() logic also wouldn't make sense for relative path > > lookups: > > > > dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC); > > > > This walks down to /bin/gcc-6.9 and then walks back up (subject to the > > same problem mentioned earlier) and check xattrs for: > > > > gcc-6.9 > > bin/ > > / > > > > then that dfd is passed to openat() to open "gcc": > > > > fd = openat(dfd, "gcc", O_RDONLY); > > > > which again walks up to /bin/gcc-6.9 and checks xattrs for: > > gcc > > gcc-6.9 > > bin/ > > / > > > > Which means this code ends up charging relative lookups twice. Even if one > > irons that out in the program this encourages really bad patterns. > > Path lookup is iterative top down. One can't just randomly walk back up and > > assume that's equivalent. > > I understand that walk-up is not equivalent to walk down. But it is probably > accurate enough for some security policies. For example, LSM LandLock uses > similar logic in the file_open hook (file security/landlock/fs.c, function > is_access_to_paths_allowed). I'm not well-versed in landlock so I'll let Mickaël comment on this with more details but there's very important restrictions and differences here. Landlock expresses security policies with file hierarchies and security_inode_permission() doesn't and cannot have access to that. Landlock is subject to the same problem that the BPF is here. Namely that the VFS permission checking could have been done on a path walk completely different from the path walk that is checked when walking back up from security_file_open(). But because landlock works with a deny-by-default security policy this is ok and it takes overmounts into account etc. > > To summary my thoughts here. I think we need: > > 1. Change security_inode_permission to take dentry instead of inode. Sorry, no. > 2. Still add bpf_dget_parent. We will use it with security_inode_permission > so that we can propagate flags from parents to children. We will need > a bpf_dput as well. > 3. There are pros and cons with different approaches to implement this > policy (tags on directory work for all files in it). We probably need > the policy writer to decide with one to use. From BPF's POV, dget_parent > is "safe", because it won't crash the system. It may encourage some bad > patterns, but it appears to be required in some use cases. You cannot just walk a path upwards and check permissions and assume that this is safe unless you have a clear idea what makes it safe in this scenario. Landlock has afaict. But so far you only have a vague sketch of checking permissions walking upwards and retrieving xattrs without any notion of the problems involved. If you provide a bpf_get_parent() api for userspace to consume you'll end up providing them with an api that is extremly easy to misuse.