On Mon, Jan 06, 2025 at 05:14:26PM -0800, Isaac Manjarres wrote: > On Fri, Jan 03, 2025 at 04:03:44PM +0100, Jann Horn wrote: > > On Fri, Jan 3, 2025 at 12:32 AM Isaac J. Manjarres > > <isaacmanjarres@xxxxxxxxxx> wrote: > > > Android currently uses the ashmem driver [1] for creating shared memory > > > regions between processes. Ashmem buffers can initially be mapped with > > > PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the > > > ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the > > > permissions that the buffer can be mapped with. > > > > > > Processes can remove the ability to map ashmem buffers as executable to > > > ensure that those buffers cannot be exploited to run unintended code. > > > > Is there really code out there that first maps an ashmem buffer with > > PROT_EXEC, then uses the ioctl to remove execute permission for future > > mappings? I don't see why anyone would do that. > > Hi Jann, > > Thanks for your feedback and for taking the time to review these > patches! > > Not that I'm aware of. The reason why I made this seal have semantics > where it prevents future executable mappings is because there are > existing applications that allocate an ashmem buffer (default > permissions are RWX), map the buffer as RW, and then restrict > the permissions to just R. > > When the buffer is mapped as RW, do_mmap() unconditionally sets > VM_MAYEXEC on the VMA for the mapping, which means that the mapping > could later be mapped as executable via mprotect(). Therefore, having > the semantics of the seal be that it prevents any executable mappings > would break existing code that has already been released. It would > make transitioning clients to memfd difficult, because to amend that, > the ashmem users would have to first restrict the permissions of the > buffer to be RW, then map it as RW, and then restrict the permissions > again to be just R, which also means an additional system call. You could do something similar to my adjustments to the F_SEAL_WRITE changes that clears VM_MAYEXEC in cases where do_mmap() maps an F_SEAL_EXEC'd without PROT_EXEC. Please note that F_SEAL_EXEC implies: F_SEAL_SHRINK F_SEAL_GROW F_SEAL_WRITE <- important, obviously F_SEAL_FUTURE_WRITE <- also important if (seals & F_SEAL_EXEC && inode->i_mode & 0111) seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE; Though interestingly only _after_ the mapping_deny_writable() call is performed which is... odd. Probably worth exploring F_SEAL_EXEC semantics in detail if you haven't already to see how viable this is. > > > > For instance, suppose process A allocates a memfd that is meant to be > > > read and written by itself and another process, call it B. > > > > > > Process A shares the buffer with process B, but process B injects code > > > into the buffer, and compromises process A, such that it makes A map > > > the buffer with PROT_EXEC. This provides an opportunity for process A > > > to run the code that process B injected into the buffer. > > > > > > If process A had the ability to seal the buffer against future > > > executable mappings before sharing the buffer with process B, this > > > attack would not be possible. > > > > I think if you want to enforce such restrictions in a scenario where > > the attacker can already make the target process perform > > semi-arbitrary syscalls, it would probably be more reliable to enforce > > rules on executable mappings with something like SELinux policy and/or > > F_SEAL_EXEC. > > > > For SELinux policy, do you mean to not allow execmem permissions? What > about scenarios where a process wants to use JIT compilation, but > doesn't want memfd data buffers to be executable? My thought was to use > this new seal to have a finer granularity to control what buffers can > be mapped as executable. If not, could you please clarify? > > Also, F_SEAL_EXEC just seals the memfd's current executable permissions, > and doesn't affect the mapping permissions at all. Are you saying that > F_SEAL_EXEC should be extended to cover mappings as well? If so, it is > not clear to me what the semantics of that would be. I need to dig into how this functions, but I'm guessing then it doesn't immediatley enforce anything preventing an existing mapping from executing? Which differs from F_SEAL_WRITE semantics and then makes it seem like it is already acting like an F_SEAL_FUTURE_EXEC in effect? Hm need to dig into this a bit. > > For instance, if a memfd is non-executable and F_SEAL_EXEC is applied, > we can also prevent any executable mappings at that point. I'm not sure > if that's the right thing to do though. For instance, there are shared > object files that don't have executable permissions, but their code > sections should be mapped as executable. So, drawing from that, I'm not > sure if it makes sense to tie the file execution permissions to the > mapping permissions. > > There's also the case where F_SEAL_EXEC is invoked on an executable > memfd. In that case, there doesn't seem to be anything to do from a > mapping perspective since memfds can be mapped as executable by > default? > > > > Android is currently trying to replace ashmem with memfd. However, memfd > > > does not have a provision to permanently remove the ability to map a > > > buffer as executable, and leaves itself open to the type of attack > > > described earlier. However, this should be something that can be > > > achieved via a new file seal. > > > > > > There are known usecases (e.g. CursorWindow [2]) where a process > > > maps a buffer with read/write permissions before restricting the buffer > > > to being mapped as read-only for future mappings. > > > > Here you're talking about write permission, but the patch is about > > execute permission? > > > > Sorry, I used this example about write permission to show why I implemented > the seal with support for preventing future mappings, since the writable > mappings that get created can become executable in the future, as > described later in the commit text. > > > > The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning > > > that mprotect() can change the mapping to be executable. Therefore, > > > implementing the seal similar to F_SEAL_WRITE would not be appropriate, > > > since it would not work with the CursorWindow usecase. This is because > > > the CursorWindow process restricts the mapping permissions to read-only > > > after the writable mapping is created. So, adding a file seal for > > > executable mappings that operates like F_SEAL_WRITE would fail. > > > > > > Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled > > > similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can > > > continue to create a writable mapping initially, and then restrict the > > > permissions on the buffer to be mappable as read-only by using both > > > F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is > > > applied, any calls to mmap() with PROT_EXEC will fail. > > > > > > [1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c > > > [2] https://developer.android.com/reference/android/database/CursorWindow > > > > > > Signed-off-by: Isaac J. Manjarres <isaacmanjarres@xxxxxxxxxx> > > > --- > > > include/uapi/linux/fcntl.h | 1 + > > > mm/memfd.c | 39 +++++++++++++++++++++++++++++++++++++- > > > 2 files changed, 39 insertions(+), 1 deletion(-) > > > > > > diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h > > > index 6e6907e63bfc..ef066e524777 100644 > > > --- a/include/uapi/linux/fcntl.h > > > +++ b/include/uapi/linux/fcntl.h > > > @@ -49,6 +49,7 @@ > > > #define F_SEAL_WRITE 0x0008 /* prevent writes */ > > > #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */ > > > #define F_SEAL_EXEC 0x0020 /* prevent chmod modifying exec bits */ > > > +#define F_SEAL_FUTURE_EXEC 0x0040 /* prevent future executable mappings */ > > > /* (1U << 31) is reserved for signed error codes */ > > > > > > /* > > > diff --git a/mm/memfd.c b/mm/memfd.c > > > index 5f5a23c9051d..cfd62454df5e 100644 > > > --- a/mm/memfd.c > > > +++ b/mm/memfd.c > > > @@ -184,6 +184,7 @@ static unsigned int *memfd_file_seals_ptr(struct file *file) > > > } > > > > > > #define F_ALL_SEALS (F_SEAL_SEAL | \ > > > + F_SEAL_FUTURE_EXEC |\ > > > F_SEAL_EXEC | \ > > > F_SEAL_SHRINK | \ > > > F_SEAL_GROW | \ > > > @@ -357,14 +358,50 @@ static int check_write_seal(unsigned long *vm_flags_ptr) > > > return 0; > > > } > > > > > > +static inline bool is_exec_sealed(unsigned int seals) > > > +{ > > > + return seals & F_SEAL_FUTURE_EXEC; > > > +} > > > + > > > +static int check_exec_seal(unsigned long *vm_flags_ptr) > > > +{ > > > + unsigned long vm_flags = *vm_flags_ptr; > > > + unsigned long mask = vm_flags & (VM_SHARED | VM_EXEC); > > > + > > > + /* Executability is not a concern for private mappings. */ > > > + if (!(mask & VM_SHARED)) > > > + return 0; > > > > Why is it not a concern for private mappings? > > > I didn't consider private mappings since it wasn't clear as to how > they could be a threat to another process. A process can copy the > contents of the buffer into another executable region of memory > and just run it from there? Or are you saying that because it > can do that, is there any value in differentiating between > shared and private mappings? Yes this is my point of view also but I might be missing something. > > Thanks, > Isaac