On Wed, Jun 07, 2006 at 05:10:25PM -0700, Dave Hansen wrote:
>
> This is the first really tricky patch in the series. It
> elevates the writer count on a mount each time a
> non-special file is opened for write.
>
> This is not completely apparent in the patch because the
> two if() conditions in may_open() above the
> mnt_want_write() call are, combined, equivalent to
> special_file().
>
> There is also an elevated count around the vfs_create()
> call in open_namei(). The count does not need to be
> kept elevated all the way into the may_open() call because
> after creation, the write bits of the acc_mode are cleared.
> This keeps may_open() from ever failing. However, this may
> open one potential race where a change from a r/w to a r/o
> mount could occur between the mnt_drop_write() and may_open()
> allowing a user to obtain a r/w file on what is now a r/w

this probably should read 'r/o mount'

what about using atomic_dec_and_test() to avoid
such cases (or a lock, if required)?

best,
Herbert

> mount. But, this functionality does not yet exist.
>
>
> Signed-off-by: Dave Hansen <haveblue@xxxxxxxxxx>
> ---
>
>  lxc-dave/fs/file_table.c |    5 ++++-
>  lxc-dave/fs/namei.c      |   29 ++++++++++++++++++++++++++---
>  lxc-dave/ipc/mqueue.c    |    3 +++
>  3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff -puN fs/namei.c~elevate-writers-opens-part1 fs/namei.c
> --- lxc/fs/namei.c~elevate-writers-opens-part1	2006-06-07 16:53:20.000000000 -0700
> +++ lxc-dave/fs/namei.c	2006-06-07 16:53:20.000000000 -0700
> @@ -1511,8 +1511,17 @@ int may_open(struct nameidata *nd, int a
>  			return -EACCES;
>  
>  		flag &= ~O_TRUNC;
> -	} else if (IS_RDONLY(inode) && (flag & FMODE_WRITE))
> -		return -EROFS;
> +	} else if (flag & FMODE_WRITE) {
> +		/*
> +		 * effectively: !special_file()
> +		 * balanced by __fput()
> +		 */
> +		error = mnt_want_write(nd->mnt);
> +		if (error)
> +			return error;
> +		if (IS_RDONLY(inode))
> +			return -EROFS;
> +	}
>  	/*
>  	 * An append-only file must be opened in append mode for writing.
>  	 */
> @@ -1642,10 +1651,24 @@ do_last:
>  	if (!path.dentry->d_inode) {
>  		if (!IS_POSIXACL(dir->d_inode))
>  			mode &= ~current->fs->umask;
> -		error = vfs_create(dir->d_inode, path.dentry, mode, nd);
> +		/*
> +		 * this serves dual roles, it makes sure there is no
> +		 * r/o mount, and keeps the write count for what is
> +		 * the newly created file
> +		 */
> +		error = mnt_want_write(nd->mnt);
> +		if (!error)
> +			error = vfs_create(dir->d_inode, path.dentry, mode, nd);
>  		mutex_unlock(&dir->d_inode->i_mutex);
>  		dput(nd->dentry);
>  		nd->dentry = path.dentry;
> +		/*
> +		 * Unconditionally drop the write access because
> +		 * the acc_mode=0 set below will keep may_open()
> +		 * from ever failing if there was a r/o mount
> +		 * between here and there
> +		 */
> +		mnt_drop_write(nd->mnt);
>  		if (error)
>  			goto exit;
>  		/* Don't check for write permission, don't truncate */
> diff -puN fs/open.c~elevate-writers-opens-part1 fs/open.c
> diff -puN include/linux/mount.h~elevate-writers-opens-part1 include/linux/mount.h
> diff -puN fs/file_table.c~elevate-writers-opens-part1 fs/file_table.c
> --- lxc/fs/file_table.c~elevate-writers-opens-part1	2006-06-07 16:53:20.000000000 -0700
> +++ lxc-dave/fs/file_table.c	2006-06-07 16:53:20.000000000 -0700
> @@ -180,8 +180,11 @@ void fastcall __fput(struct file *file)
>  	if (unlikely(inode->i_cdev != NULL))
>  		cdev_put(inode->i_cdev);
>  	fops_put(file->f_op);
> -	if (file->f_mode & FMODE_WRITE)
> +	if (file->f_mode & FMODE_WRITE) {
>  		put_write_access(inode);
> +		if(!special_file(inode->i_mode))
> +			mnt_drop_write(mnt);
> +	}
>  	file_kill(file);
>  	file->f_dentry = NULL;
>  	file->f_vfsmnt = NULL;
> diff -puN ipc/mqueue.c~elevate-writers-opens-part1 ipc/mqueue.c
> --- lxc/ipc/mqueue.c~elevate-writers-opens-part1	2006-06-07 16:53:20.000000000 -0700
> +++ lxc-dave/ipc/mqueue.c	2006-06-07 16:53:20.000000000 -0700
> @@ -679,6 +679,9 @@ asmlinkage long sys_mq_open(const char _
>  			goto out;
>  		filp = do_open(dentry, oflag);
>  	} else {
> +		error = mnt_want_write(mqueue_mnt);
> +		if (error)
> +			goto out;
>  		filp = do_create(mqueue_mnt->mnt_root, dentry,
>  						oflag, mode, u_attr);
>  	}
> _
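
To make the race discussed in the description a bit more concrete, here is a rough user-space sketch of one way the window between mnt_drop_write() and may_open() could be closed. It is only a model under stated assumptions: struct model_mnt and every model_*() function are hypothetical names, not anything from the patch or the kernel, and it uses a C11 compare-and-swap loop in place of the kernel's atomic_dec_and_test(). The assumed rule is that a r/w to r/o switch may only succeed while the writer count is zero, and that the reference for the open is taken before the one covering the create is dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a mount; not the kernel's struct vfsmount. */
struct model_mnt {
	atomic_int writers;	/* >= 0: r/w with that many writers, -1: r/o */
};

#define MODEL_MNT_RDONLY (-1)

/* Take a write reference, or fail with -EROFS if the mount is r/o. */
static int model_want_write(struct model_mnt *mnt)
{
	int old = atomic_load(&mnt->writers);

	do {
		if (old == MODEL_MNT_RDONLY)
			return -EROFS;
		/* on failure the CAS reloads 'old' and the check is repeated */
	} while (!atomic_compare_exchange_weak(&mnt->writers, &old, old + 1));
	return 0;
}

/* Release a write reference; must balance a successful model_want_write(). */
static void model_drop_write(struct model_mnt *mnt)
{
	int prev = atomic_fetch_sub(&mnt->writers, 1);

	assert(prev > 0);
	(void)prev;
}

/* Assumed policy: r/w -> r/o succeeds only while nobody holds a reference. */
static int model_remount_ro(struct model_mnt *mnt)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&mnt->writers, &expected,
					   MODEL_MNT_RDONLY))
		return 0;
	return -EBUSY;
}

/*
 * Create-then-open without a window: the reference that covers the open
 * file is taken before the reference that covered the create is dropped,
 * so the count never reaches zero in between.
 */
static int model_create_and_open(struct model_mnt *mnt)
{
	int error = model_want_write(mnt);	/* covers the create */

	if (error)
		return error;
	/* ... the file would be created here ... */
	error = model_want_write(mnt);		/* covers the open file */
	model_drop_write(mnt);			/* create reference is done */
	return error;
}
```

After a successful model_create_and_open() the count is 1, held on behalf of the open file, so model_remount_ro() returns -EBUSY until the matching model_drop_write() (the __fput() side in the patch) has run. Whether the real code wants to refuse the remount or wait for writers is a policy question this sketch does not try to answer.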