On Thu, Dec 10, 2015 at 10:16 AM, Willy Tarreau <w@xxxxxx> wrote:
> On Thu, Dec 10, 2015 at 10:05:50AM -0800, Kees Cook wrote:
>> On Wed, Dec 9, 2015 at 11:06 PM, Willy Tarreau <w@xxxxxx> wrote:
>> > Hi Kees,
>> >
>> > Why not add a new file flag instead ?
>> >
>> > Something like this (editing your patch by hand to illustrate) :
>> >
>> > diff --git a/fs/file_table.c b/fs/file_table.c
>> > index ad17e05ebf95..3a7eee76ea90 100644
>> > --- a/fs/file_table.c
>> > +++ b/fs/file_table.c
>> > @@ -191,6 +191,17 @@ static void __fput(struct file *file)
>> >
>> >         might_sleep();
>> >
>> > +       /*
>> > +        * XXX: While avoiding mmap_sem, we've already been written to.
>> > +        * We must ignore the return value, since we can't reject the
>> > +        * write.
>> > +        */
>> > +       if (unlikely(file->f_flags & FL_DROP_PRIVS)) {
>> > +               mutex_lock(&inode->i_mutex);
>> > +               file_remove_privs(file);
>> > +               mutex_unlock(&inode->i_mutex);
>> > +       }
>> > +
>> >         fsnotify_close(file);
>> >         /*
>> >          * The function eventpoll_release() should be the first called
>> > diff --git a/include/linux/fs.h b/include/linux/fs.h
>> > index 3aa514254161..409bd7047e7e 100644
>> > --- a/include/linux/fs.h
>> > +++ b/include/linux/fs.h
>> > @@ -913,3 +913,4 @@
>> >  #define FL_OFDLCK      1024    /* lock is "owned" by struct file */
>> >  #define FL_LAYOUT      2048    /* outstanding pNFS layout */
>> > +#define FL_DROP_PRIVS  4096    /* lest something weird decides that 2 is OK */
>> >
>> > diff --git a/mm/memory.c b/mm/memory.c
>> > index c387430f06c3..08a77e0cf65f 100644
>> > --- a/mm/memory.c
>> > +++ b/mm/memory.c
>> > @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm,
>> >
>> >                 if (!page_mkwrite)
>> >                         file_update_time(vma->vm_file);
>> > +               vma->vm_file->f_flags |= FL_DROP_PRIVS;
>> >         }
>> >
>> >         return VM_FAULT_WRITE;
>> >
>> > Willy
>> >
>>
>> Is f_flags safe to write like this without holding a lock?
>
> Unfortunately I have no idea.
> I've seen places where it's written without
> taking a lock such as in blkdev_open() and I don't think that this one is
> called with a lock held.
>
> The comment in fs.h says that spinlock f_lock is here to protect f_flags
> (among others) and that it must not be taken from IRQ context. Thus I'd
> think we "just" have to take it to remain safe. That would be just one
> spinlock per first write via mmap() to a file, I don't know if that's
> reasonable or not :-/

Al, what's the best way forward here? I created a separate flag variable
so it could be used effectively write-only, with the read happening only
at final fput.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security
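
For reference, the two options discussed in this thread (taking f_lock around the f_flags update, or using a separate field that is only ever set on write and read at final fput) can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not kernel code: struct file_model, MODEL_DROP_PRIVS, and all function names are hypothetical, a pthread mutex stands in for the f_lock spinlock, and a C11 atomic stands in for the separate flag variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit for this model only; not a real kernel flag value. */
#define MODEL_DROP_PRIVS 0x1000u

/* Userspace stand-in for the parts of struct file discussed in the thread. */
struct file_model {
	pthread_mutex_t f_lock;      /* models the f_lock spinlock */
	unsigned int f_flags;        /* shared word; other bits may change too */
	atomic_bool needs_priv_drop; /* separate field: only set, read at final put */
};

static void file_model_init(struct file_model *f, unsigned int flags)
{
	assert(f != NULL);
	pthread_mutex_init(&f->f_lock, NULL);
	f->f_flags = flags;
	atomic_init(&f->needs_priv_drop, false);
}

/*
 * Option A: set a bit in the shared flags word. The |= is a
 * read-modify-write, so it is done under the lock to avoid losing a
 * concurrent update to some other bit of f_flags.
 */
static void mark_written_locked(struct file_model *f)
{
	pthread_mutex_lock(&f->f_lock);
	f->f_flags |= MODEL_DROP_PRIVS;
	pthread_mutex_unlock(&f->f_lock);
}

/*
 * Option B: a separate field that writers only ever set to true.
 * No other state shares the word, so no read-modify-write is needed.
 */
static void mark_written_separate(struct file_model *f)
{
	atomic_store_explicit(&f->needs_priv_drop, true, memory_order_relaxed);
}

/*
 * Models the check at final fput: returns true if privileges would be
 * removed, i.e. if either marker was set during the file's lifetime.
 */
static bool final_put(struct file_model *f)
{
	bool drop;

	assert(f != NULL);
	pthread_mutex_lock(&f->f_lock);
	drop = (f->f_flags & MODEL_DROP_PRIVS) != 0;
	pthread_mutex_unlock(&f->f_lock);

	if (atomic_load_explicit(&f->needs_priv_drop, memory_order_relaxed))
		drop = true;
	return drop;
}
```

In this model, option A pays for a lock on every marking call, while option B avoids the lock because nothing else shares the field. Whether either maps cleanly onto the real fault path and __fput() is exactly the open question above.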