> On 29 January 2015 at 13:46 Jan Kara <jack@xxxxxxx> wrote:
>
>
> Changed subject and added linux-fsdevel to CC so that other developers
> who read this don't fall into the same trap :).
>
> On Wed 28-01-15 22:45:34, Al Viro wrote:
> > On Wed, Jan 28, 2015 at 01:45:24PM -0800, akpm@xxxxxxxxxxxxxxxxxxxx wrote:
> > > atomic_t i_opencnt was used to free allocation in case there were no more
> > > opens. This patch replaces affs_file_open by generic_file_open and uses
> > > FMODE_WRITE/i_writecount==1 for the task like other FS.
> > >
> > > affs_file_release(struct inode *inode, struct file *filp)
> > > {
> > > -	pr_debug("release(%lu, %d)\n",
> > > -		 inode->i_ino, atomic_read(&AFFS_I(inode)->i_opencnt));
> > > +	pr_debug("release(%lu)\n", inode->i_ino);
> > >
> > > -	if (atomic_dec_and_test(&AFFS_I(inode)->i_opencnt)) {
> > > +	if ((filp->f_mode & FMODE_WRITE) &&
> > > +	    (atomic_read(&inode->i_writecount) == 1)) {
> >
> > I'm not at all convinced that this is correct for affs. Or for anything
> > else, for that matter. Look: suppose somebody else is trying to open
> > that sucker with O_TRUNC at that moment and they'd already gotten past
> > get_write_access() in handle_truncate(), only to fail on
> > locks_verify_locked().
> > _That_ open() won't get anywhere near opening the file, so there won't be
> > ->release() for it. And our ->release() will see ->i_writecount greater
> > than 1, due to get_write_access() done in handle_truncate() and still not
> > balanced by coming put_write_access() in there - we'll call it after the
> > locks_verify_locked() reports failure, but that hasn't happened yet.
> >
> > Similar scenarios can almost certainly be constructed for other calls of
> > get_write_access() as well, but this one is enough to NAK this patch, _and_
> > to make the similar logics in other filesystems very suspicious...
> Thanks for pointing this out.
> You made me look at where exactly
> get_write_access() is called, and there are even places where we call it
> without having a file descriptor at all (e.g. the truncate path). So ext3, ext4,
> udf, and gfs2 are racy. If we race, the results aren't that bad (we just keep
> preallocated blocks in the inode) but it would still be nice to fix.
>
> Obviously we could maintain a private writecount in the ->open() method, but it
> would seem a bit sad to do that for this mostly theoretical issue. Maybe we
> could just verify whether preallocation is truncated when evicting the inode
> from memory and, if not, do it there. It's not perfect, but even with the
> current racy solution no one has noticed in practice.

Note that udf is slightly different: it checks for i_writecount > 1, not == 1,
which means it would release the file in the scenario described above ...

Regards,
Fabian

>
> Honza
> --
> Jan Kara <jack@xxxxxxx>
> SUSE Labs, CR
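
To make the race discussed above concrete, here is a minimal userspace
model of it. This is a sketch, not kernel code: struct model_inode, the
model_*() helpers and enum release_check are hypothetical names invented
for illustration, plain ints stand in for the atomic counters, and the
interleaving is fixed by hand instead of using real threads. It assumes
that ->release() runs before the closing file's own write access is
dropped, which is why the patch compares against 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these names are hypothetical, not kernel APIs. */
struct model_inode {
	int writecount;     /* stands in for inode->i_writecount */
	int private_opens;  /* writable opens counted by a private ->open() hook */
	bool prealloc;      /* preallocated blocks still attached to the inode */
};

enum release_check { CHECK_EQ_1, CHECK_GT_1, CHECK_PRIVATE };

static void model_get_write_access(struct model_inode *inode)
{
	inode->writecount++;
}

static void model_put_write_access(struct model_inode *inode)
{
	assert(inode->writecount > 0);
	inode->writecount--;
}

/* A successful writable open: takes write access and is counted privately. */
static void model_open_write(struct model_inode *inode)
{
	model_get_write_access(inode);
	inode->private_opens++;
	inode->prealloc = true;	/* pretend a write preallocated blocks */
}

/* Close of a writable file: the release check runs, then access is dropped. */
static void model_close_write(struct model_inode *inode, enum release_check check)
{
	bool last_writer = false;

	inode->private_opens--;
	switch (check) {
	case CHECK_EQ_1:    last_writer = inode->writecount == 1; break;
	case CHECK_GT_1:    last_writer = inode->writecount > 1; break;
	case CHECK_PRIVATE: last_writer = inode->private_opens == 0; break;
	}
	if (last_writer)
		inode->prealloc = false;	/* discard preallocation */
	model_put_write_access(inode);
}

/*
 * One writer opens and closes the file. If racing is true, a second task
 * holds transient write access (as in the O_TRUNC path) across the close.
 * Returns true if preallocated blocks were left behind.
 */
static bool model_leaks_prealloc(enum release_check check, bool racing)
{
	struct model_inode inode = { 0, 0, false };

	model_open_write(&inode);
	if (racing)
		model_get_write_access(&inode);	/* taken before the lock check fails */
	model_close_write(&inode, check);
	if (racing)
		model_put_write_access(&inode);	/* dropped after the failure */
	assert(inode.writecount == 0);
	return inode.prealloc;
}
```

In this model the == 1 check cleans up in the normal case but leaves the
preallocation behind when the transient get_write_access() overlaps the
close. The > 1 check behaves the other way round: it cleans up only in
the racing case. The private count cleans up in both cases, at the cost
of the extra bookkeeping in ->open().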