On Sun, Jan 10, 2016 at 06:48:32PM +0300, Konstantin Khlebnikov wrote: > I think this should be done in mmap/mprotect. Code in sys_mmap is trivial. > > In sys_mprotect you can check file_needs_remove_privs() and VM_SHARED > under mmap_sem, then if needed grab reference to struct file from vma and > clear suid after unlocking mmap_sem. Which vma? mprotect(2) can cover more than one mapping... You'd have to play interesting games to collect the set of affected struct file; it _might_ be doable (e.g. by using task_work_add() to have the damn thing trigger on the way to userland), but it would require some care to avoid hitting the same file more than once - it might, after all, be mmapped in more than one process, so racing mprotect() would need to be taken into account. Hell knows - might be doable, but I'm not sure it'll be any prettier. ->f_u.fu_rcuhead.func would need to be zeroed on struct file allocation, and that code would need to * check ->f_u.fu_rcuhead.func; if non-NULL - sod off, nothing to do * lock ->f_lock * recheck (and unlock if we'd lost a race and need to sod off) * get_file() * task_work_add() on ->f_u.fu_rcuhead * drop ->f_lock with task_work_add() callback removing SUID, zero ->fu.fu_rcuhead.func (under ->f_lock) and finally fput(). In principle, that would work; the primitive would be along the lines of "make sure that SUID is removed before return to userland" and both mmap and mprotect would use it. The primitive itself would be in fs/file_table.c, encapsulating the messy details in there. All existing users of ->f_u don't touch it until ->f_count drops to 0, so we are OK to use it here. It obviously should _not_ be used for kernel threads (task_work_add() won't do us any good in those, but then we are not going to do mmap or mprotect there either). Regular writes should *not* use that - they ought to strip SUID directly. Might be worth trying... Any takers? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html