On Wed, May 18, 2016 at 10:20:39AM +0200, Jan Kara wrote: > On Wed 18-05-16 08:58:42, Dave Chinner wrote: > > On Tue, May 17, 2016 at 12:34:57PM -0700, Dan Williams wrote: > > > On Tue, May 17, 2016 at 11:29 AM, Jon Derrick > > > <jonathan.derrick@xxxxxxxxx> wrote: > > > > This patch fixes S_DAX bd_inode i_flag locking to conform to suggested > > > > > > A "fix" implies that its currently broken. I don't see how it is, not > > > until we add an ioctl method or other path that also tries to update > > > the flags outside of blkdev_get() context. So, I don't think this > > > patch stands on its own if you were intending it to be merged > > > separately. > > > > > > > locking rules. It is presumed that S_DAX is the only valid inode flag > > > > for a block device which subscribes to direct-access, and will restore > > > > any previously set flags if direct-access initialization fails. > > > > > > > > This reverts to i_flags behavior prior to > > > > bbab37ddc20bae4709bca8745c128c4f46fe63c5 > > > > by allowing other bd_inode flags when DAX is disabled > > > > > > > > Signed-off-by: Jon Derrick <jonathan.derrick@xxxxxxxxx> > > > > --- > > > > fs/block_dev.c | 31 ++++++++++++++++++++++++++----- > > > > 1 file changed, 26 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/fs/block_dev.c b/fs/block_dev.c > > > > index 20a2c02..d41e37f 100644 > > > > --- a/fs/block_dev.c > > > > +++ b/fs/block_dev.c > > > > @@ -1159,6 +1159,20 @@ void bd_set_size(struct block_device *bdev, loff_t size) > > > > } > > > > EXPORT_SYMBOL(bd_set_size); > > > > > > > > +static void bd_add_dax(struct inode *inode) > > > > +{ > > > > + inode_lock(inode); > > > > + inode->i_flags |= S_DAX; > > > > + inode_unlock(inode); > > > > +} > > > > + > > > > +static void bd_clear_dax(struct inode *inode) > > > > +{ > > > > + inode_lock(inode); > > > > + inode->i_flags &= ~S_DAX; > > > > + inode_unlock(inode); > > > > +} > > > > > > Since this is inode generic should these helpers be prefixed "i_" > > > rather than "bd_"? > > > > Probably not, because in general filesystems are responsible for > > updating i_flags to reflect on-disk inode configuration and that's > > typically done under transaction contexts. e.g. through ioctl > > interfaces to set/clear flags that are stored on disk. As such, > > inode->i_flags is effectively protected by the filesystem specific > > locking heirarchy, not the generic inode_lock(). > > > > e.g. have a look at XFS storing a persistent "DAX-enabled" flag in > > the inode, which can be set/cleared on individual inodes dynamically > > by FS_IOC_FSSETXATTR. The XFS i_flags update function assumes > > exclusive access to the field as it is called under locked > > transaction context. Similar code exists in ext4, btrfs, gfs2, > > etc.... > > So in case of ext4, we actually do use inode_lock() to protect against > racing IOC_SETFLAGS calls. i_flags is a strange mix and there are a few > (like S_DEAD or S_NOSEC) which are not persistent and those get set / > cleared by VFS. Some of those places (e.g. the clearing in > __vfs_setxattr_noperm()) are not really controlled by the filesystem > AFAICT. So when XFS doesn't use inode_lock() to protect i_flags updates it > can race with VFS on i_flags updates. There are several other filesystems with the same problem. It's not actually clear what they rules for i_flags are. The comment above inode_set_flags() is ambiguous at best, and even points this out: * In the long run, i_mutex is overkill, and we should probably look * at using the i_lock spinlock to protect i_flags, and then make sure * it is so documented in include/linux/fs.h and that all code follows * the locking convention!! Perhaps that should be done before the situation gets worse... Cheers Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html