On Wed 09-06-10 17:33:36, Nick Piggin wrote: > On Tue, Jun 01, 2010 at 01:39:37PM +0200, Christoph Hellwig wrote: > > Make sure we check the truncate constraints early on in ->setattr by adding > > those checks to inode_change_ok. Also clean up and document inode_change_ok > > to make this obvious. > > > > As a fallout we don't have to call inode_newsize_ok from simple_setsize and > > simplify it down to a truncate_setsize which doesn't return an error. This > > simplifies a lot of setattr implementations and means we use truncate_setsize > > almost everywhere. Get rid of fat_setsize now that it's trivial and mark > > ext2_setsize static to make the calling convention obvious. > > > > Keep the inode_newsize_ok in vmtruncate for now as all callers need an > > audit for it's removal anyway. > > > > Note: setattr code in ecryptfs doesn't call inode_change_ok at all and > > needs a deeper audit, but that is left for later. > > > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > > > > Index: linux-2.6/fs/attr.c > > =================================================================== > > --- linux-2.6.orig/fs/attr.c 2010-06-01 12:19:47.987277079 +0200 > > +++ linux-2.6/fs/attr.c 2010-06-01 12:48:26.063005672 +0200 > > @@ -14,35 +14,53 @@ > > #include <linux/fcntl.h> > > #include <linux/security.h> > > > > -/* Taken over from the old code... */ > > - > > -/* POSIX UID/GID verification for setting inode attributes. */ > > +/** > > + * inode_change_ok - check if attribute changes to an inode are allowed > > + * @inode: inode to check > > + * @attr: attributes to change > > + * > > + * Check if we are allowed to change the attributes contained in @attr > > + * in the given inode. This includes the normal unix access permission > > + * checks, as well as checks for rlimits and others. > > + * > > + * Should be called as the first thing in ->setattr implementations, > > + * possibly after taking additional locks. > > + */ > > int inode_change_ok(const struct inode *inode, struct iattr *attr) > > { > > - int retval = -EPERM; > > unsigned int ia_valid = attr->ia_valid; > > > > + /* > > + * First check size constraints. These can't be overriden using > > + * ATTR_FORCE. > > + */ > > + if (attr->ia_mode & ATTR_SIZE) { > > + int error = inode_newsize_ok(inode, attr->ia_size); > > + if (error) > > + return error; > > + } > > Hmm, I don't know if we can do this unless you have audited the > filesystems (in which case they should be on the cc list of this > pach). > > The problem is whether the i_size is valid and stable at this > point. And it doesn't help even if you do leave the inode_newsize_ok > check inside the vmtruncate part of the fs if the check incorrectly > fails here. > > ocfs2 performs inode_change_ok outside ocfs2_rw_lock and > ocfs2_inode_lock, and inode_newsize_ok inside; cifs holds i_lock > while checking inode_newsize_ok and updating size; gfs2 inside > gfs2_trans_begin. That's a good point. For all local filesystems I know, holding i_mutex is enough for having stable i_size. But for clustered filesystems it definitely isn't. They have to hold cluster locks to be able to reliably check current i_size (at least OCFS2 does). Looking at what inode_newsize_ok currently does, i_size is only used to decide whether we need to check for rlimit or not. So we could falsely miss this check (other node is truncating the file below new offset)... Hmm, OK, so we really need the cluster lock... BTW: Mark, don't we need the cluster lock also for the permission checks in inode_change_ok? Otherwise we could see: Node1 Node2 chmod("file", 000); truncate("file", 0) inode_change_ok still see old perms -> success And Node1 and Node2 can be fully serialized via some userspace synchronization and still hit this so it's not just a race... Honza -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html