Hi Christian, Thanks again for your suggestions here. I have got more questions on this work. > On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@xxxxxxxxxx> wrote: [...] >> I am not sure I follow the suggestion to implement this with >> security_inode_permission()? Could you please share more details about >> this idea? > > Given a path like /bin/gcc-6.9/gcc what that code currently does is: > > * walk down to /bin/gcc-6.9/gcc > * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for: > gcc > gcc-6.9/ > bin/ > / > > That's broken because someone could've done > mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on > /attack even though the path lookup was for /bin/gcc-6.9. IOW, the > security_file_open() checks have nothing to do with the permission checks that > were done during path lookup. > > Why isn't that logic: > > * walk down to /bin/gcc-6.9/gcc and check for each component: > > security_inode_permission(/) > security_inode_permission(gcc-6.9/) > security_inode_permission(bin/) > security_inode_permission(gcc) > security_file_open(gcc) I am trying to implement this approach. The idea, IIUC, is: 1. For each open/openat, as we walk the path in do_filp_open=>path_openat, check xattr for "/", "gcc-6.9/", "bin/" for all given flags. 2. Save the value of the flag somewhere (for BPF, we can use inode local storage). This is needed, because openat(dfd, ..) will not start from root again. 3. Propagate these flag to children. All the above are done at security_inode_permission. 4. Finally, at security_file_open, check the xattr with the file, which is probably propagated from some parents. Did I get this right? IIUC, there are a few issues with this approach. 1. security_inode_permission takes inode as parameter. However, we need dentry to get the xattr. Shall we change security_inode_permission to take dentry instead? PS: Maybe we should change most/all inode hooks to take dentry instead? 2. There is no easy way to propagate data from parent. Assuming we already changed security_inode_permission to take dentry, we still need some mechanism to look up xattr from the parent, which is probably still something like bpf_dget_parent(). Or maybe we should add another hook with both parent and child dentry as input? 3. Given we save the flag from parents in children's inode local storage, we may consume non-trivial extra memory. BPF inode local storage will be freed as the inode gets freed, so we will not leak any memory or overflow some hash map. However, this will probably increase the memory consumption of inode by a few percents. I think a "walk-up" based approach will not have this problem, as we don't need the extra storage. Of course, this means more xattr lookups in some cases. > > I think that dget_parent() logic also wouldn't make sense for relative path > lookups: > > dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC); > > This walks down to /bin/gcc-6.9 and then walks back up (subject to the > same problem mentioned earlier) and check xattrs for: > > gcc-6.9 > bin/ > / > > then that dfd is passed to openat() to open "gcc": > > fd = openat(dfd, "gcc", O_RDONLY); > > which again walks up to /bin/gcc-6.9 and checks xattrs for: > gcc > gcc-6.9 > bin/ > / > > Which means this code ends up charging relative lookups twice. Even if one > irons that out in the program this encourages really bad patterns. > Path lookup is iterative top down. One can't just randomly walk back up and > assume that's equivalent. I understand that walk-up is not equivalent to walk down. But it is probably accurate enough for some security policies. For example, LSM LandLock uses similar logic in the file_open hook (file security/landlock/fs.c, function is_access_to_paths_allowed). To summary my thoughts here. I think we need: 1. Change security_inode_permission to take dentry instead of inode. 2. Still add bpf_dget_parent. We will use it with security_inode_permission so that we can propagate flags from parents to children. We will need a bpf_dput as well. 3. There are pros and cons with different approaches to implement this policy (tags on directory work for all files in it). We probably need the policy writer to decide with one to use. From BPF's POV, dget_parent is "safe", because it won't crash the system. It may encourage some bad patterns, but it appears to be required in some use cases. Does this make sense? Thanks, Song