This is the first really tricky patch in the series.  It elevates the writer
count on a mount each time a non-special file is opened for write.

This is not completely apparent in the patch because the two if() conditions
in may_open() above the mnt_want_write() call are, combined, equivalent to
special_file().  The else branch that calls mnt_want_write() is therefore
reached only for non-special files.

There is also an elevated count around the vfs_create() call in open_namei().
The count does not need to be kept elevated all the way into the may_open()
call because after creation, the write bits of the acc_mode are cleared.  This
keeps may_open() from ever failing.  However, this may open one potential race:
a change from a r/w to a r/o mount could occur between the mnt_drop_write()
and may_open(), allowing a user to obtain a r/w file on what is now a r/o
mount.  But the functionality to make that change does not exist yet.

Signed-off-by: Dave Hansen <haveblue@xxxxxxxxxx>
---

 lxc-dave/fs/file_table.c |    5 ++++-
 lxc-dave/fs/namei.c      |   29 ++++++++++++++++++++++++++---
 lxc-dave/ipc/mqueue.c    |    3 +++
 3 files changed, 33 insertions(+), 4 deletions(-)

diff -puN fs/namei.c~elevate-writers-opens-part1 fs/namei.c
--- lxc/fs/namei.c~elevate-writers-opens-part1	2006-06-07 16:53:20.000000000 -0700
+++ lxc-dave/fs/namei.c	2006-06-07 16:53:20.000000000 -0700
@@ -1511,8 +1511,17 @@ int may_open(struct nameidata *nd, int a
 			return -EACCES;
 
 		flag &= ~O_TRUNC;
-	} else if (IS_RDONLY(inode) && (flag & FMODE_WRITE))
-		return -EROFS;
+	} else if (flag & FMODE_WRITE) {
+		/*
+		 * effectively: !special_file()
+		 * balanced by __fput()
+		 */
+		error = mnt_want_write(nd->mnt);
+		if (error)
+			return error;
+		if (IS_RDONLY(inode))
+			return -EROFS;
+	}
 	/*
 	 * An append-only file must be opened in append mode for writing.
 	 */
@@ -1642,10 +1651,24 @@ do_last:
 	if (!path.dentry->d_inode) {
 		if (!IS_POSIXACL(dir->d_inode))
 			mode &= ~current->fs->umask;
-		error = vfs_create(dir->d_inode, path.dentry, mode, nd);
+		/*
+		 * this serves dual roles, it makes sure there is no
+		 * r/o mount, and keeps the write count for what is
+		 * the newly created file
+		 */
+		error = mnt_want_write(nd->mnt);
+		if (!error)
+			error = vfs_create(dir->d_inode, path.dentry, mode, nd);
 		mutex_unlock(&dir->d_inode->i_mutex);
 		dput(nd->dentry);
 		nd->dentry = path.dentry;
+		/*
+		 * Unconditionally drop the write access because
+		 * the acc_mode=0 set below will keep may_open()
+		 * from ever failing if there was a r/o mount
+		 * between here and there
+		 */
+		mnt_drop_write(nd->mnt);
 		if (error)
 			goto exit;
 		/* Don't check for write permission, don't truncate */
diff -puN fs/open.c~elevate-writers-opens-part1 fs/open.c
diff -puN include/linux/mount.h~elevate-writers-opens-part1 include/linux/mount.h
diff -puN fs/file_table.c~elevate-writers-opens-part1 fs/file_table.c
--- lxc/fs/file_table.c~elevate-writers-opens-part1	2006-06-07 16:53:20.000000000 -0700
+++ lxc-dave/fs/file_table.c	2006-06-07 16:53:20.000000000 -0700
@@ -180,8 +180,11 @@ void fastcall __fput(struct file *file)
 	if (unlikely(inode->i_cdev != NULL))
 		cdev_put(inode->i_cdev);
 	fops_put(file->f_op);
-	if (file->f_mode & FMODE_WRITE)
+	if (file->f_mode & FMODE_WRITE) {
 		put_write_access(inode);
+		if(!special_file(inode->i_mode))
+			mnt_drop_write(mnt);
+	}
 	file_kill(file);
 	file->f_dentry = NULL;
 	file->f_vfsmnt = NULL;
diff -puN ipc/mqueue.c~elevate-writers-opens-part1 ipc/mqueue.c
--- lxc/ipc/mqueue.c~elevate-writers-opens-part1	2006-06-07 16:53:20.000000000 -0700
+++ lxc-dave/ipc/mqueue.c	2006-06-07 16:53:20.000000000 -0700
@@ -679,6 +679,9 @@ asmlinkage long sys_mq_open(const char _
 				goto out;
 			filp = do_open(dentry, oflag);
 		} else {
+			error = mnt_want_write(mqueue_mnt);
+			if (error)
+				goto out;
 			filp = do_create(mqueue_mnt->mnt_root, dentry,
 						oflag, mode, u_attr);
 		}
_
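For anyone who wants the balancing rule in isolation, here is a minimal
user-space sketch of it.  It is not kernel code and not part of the patch.
Every model_* name is hypothetical, struct model_mount only stands in for
struct vfsmount, and the -EROFS and -EBUSY return values are assumptions about
how mnt_want_write() and a later r/o transition might behave.  The one thing
it is meant to show is that the open path and the final-close path test the
same condition (opened for write, not a special file), so the writer count
returns to zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vfsmount; not the kernel type. */
struct model_mount {
	int writers;		/* outstanding write references */
	bool read_only;		/* true once the mount is r/o */
};

/* Stand-in for mnt_want_write(); the -EROFS result is an assumption. */
static int model_want_write(struct model_mount *mnt)
{
	if (mnt->read_only)
		return -EROFS;
	mnt->writers++;
	return 0;
}

/* Stand-in for mnt_drop_write(). */
static void model_drop_write(struct model_mount *mnt)
{
	mnt->writers--;
}

/* Open path: only a non-special file opened for write takes a reference. */
static int model_open(struct model_mount *mnt, bool special, bool for_write)
{
	if (for_write && !special)
		return model_want_write(mnt);
	return 0;
}

/* Final-close path: same condition as the open path, so the two balance. */
static void model_fput(struct model_mount *mnt, bool special, bool for_write)
{
	if (for_write && !special)
		model_drop_write(mnt);
}

/* Assumed rule: the mount cannot go r/o while writers are outstanding. */
static int model_remount_ro(struct model_mount *mnt)
{
	if (mnt->writers > 0)
		return -EBUSY;
	mnt->read_only = true;
	return 0;
}
```

In this model, a write open of a regular file blocks model_remount_ro() until
the matching model_fput(), while a write open of a special file never touches
the count.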