On Mon, Nov 23, 2015 at 4:26 AM, Jan Kara <jack@xxxxxxx> wrote: > On Thu 19-11-15 16:10:43, Kees Cook wrote: >> Normally, when a user can modify a file that has setuid or setgid bits, >> those bits are cleared when they are not the file owner or a member of the >> group. This is enforced when using write() directly but not when writing >> to a shared mmap on the file. This could allow the file writer to gain >> privileges by changing the binary without losing the setuid/setgid bits. >> >> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx> >> Cc: stable@xxxxxxxxxxxxxxx > > So I had another look at this and now I understand why we didn't do it from > the start: > > To call file_remove_privs() safely, we need to hold inode->i_mutex since > that operations is going to modify file mode / extended attributes and > i_mutex protects those. However we cannot get i_mutex in the page fault > path as that ranks above mmap_sem which we hold during the whole page > fault. Ah, I see the notation in __generic_file_write_iter about i_mutex. Should file_remove_privs() get some debug annotation to catch callers that don't hold that mutex? (That would have alerted me much earlier.) > So calling file_remove_privs() when opening the file is probably as good as > it can get. It doesn't catch the case when suid bits / IMA attrs are set > while the file is already open but I don't see easy way around this. I agree with Eric: mmap time seems like the right place. > BTW: This is another example where page fault locking is constraining us > and life would be simpler for filesystems we they get called without > mmap_sem held... > > Honza > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR -Kees -- Kees Cook Chrome OS & Brillo Security -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html