Re: Elevated i_writecount doesn't guarantee ->release to be called

Fabian Frederick <fabf@xxxxxxxxx> · Thu, 29 Jan 2015 17:47:56 +0100 (CET)



> On 29 January 2015 at 13:46 Jan Kara <jack@xxxxxxx> wrote:
>
>
>   Changed subject and added linux-fsdevel to CC so that other developers
> who read this don't fall into the same trap :).
>
> On Wed 28-01-15 22:45:34, Al Viro wrote:
> > On Wed, Jan 28, 2015 at 01:45:24PM -0800, akpm@xxxxxxxxxxxxxxxxxxxx wrote:
> > > atomic_t i_opencnt was used to free allocation in case there were no more
> > > opens.  This patch replaces affs_file_open by generic_file_open and uses
> > > FMODE_WRITE/i_writecount==1 for the task like other FS.
> >
> >
> > >  affs_file_release(struct inode *inode, struct file *filp)
> > >  {
> > > - pr_debug("release(%lu, %d)\n",
> > > -          inode->i_ino, atomic_read(&AFFS_I(inode)->i_opencnt));
> > > + pr_debug("release(%lu)\n", inode->i_ino);
> > > 
> > > - if (atomic_dec_and_test(&AFFS_I(inode)->i_opencnt)) {
> > > + if ((filp->f_mode & FMODE_WRITE) &&
> > > +     (atomic_read(&inode->i_writecount) == 1)) {
> >
> > I'm not at all convinced that this is correct for affs.  Or for anything
> > else, for that matter.  Look: suppose somebody else is trying to open
> > that sucker with O_TRUNC at that moment and they'd already gotten past
> > get_write_access() in handle_truncate(), only to fail on
> > locks_verify_locked().
> > _That_ open() won't get anywhere near opening the file, so there won't be
> > ->release() for it.  And our ->release() will see ->i_writecount greater
> > than 1, due to get_write_access() done in handle_truncate() and still not
> > balanced by coming put_write_access() in there - we'll call it after the
> > locks_verify_locked() reports failure, but that hasn't happened yet.
> >
> > Similar scenarios can almost certainly be constructed for other calls of
> > get_write_access() as well, but this one is enough to NAK this patch, _and_
> > to make the similar logics in other filesystems very suspicious...
>   Thanks for pointing this out. You made me look at where exactly
> get_write_access() is called, and there are even places where we call it
> without having a file descriptor at all (e.g. the truncate path). So ext3, ext4,
> udf, and gfs2 are racy. If we race, the results aren't that bad (we just keep
> preallocated blocks in the inode) but it would still be nice to fix.
>
> Obviously we could maintain a private writecount in the ->open() method but it
> would seem a bit sad to do that for this mostly theoretical issue. Maybe we
> could just verify whether preallocation has been truncated when evicting the
> inode from memory and, if not, do it there. It's not perfect, but even with the
> current racy solution no one has noticed in practice.
Note that udf is slightly different: it checks for i_writecount > 1 rather than
== 1, which means it would release the file in the scenario described above.
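
To make the comparison concrete, here is a rough userspace model of the window
Al describes. It is only a sketch: model_inode, verdict, and
last_writer_release() are hypothetical names, not kernel API, and the
"private" counter simply assumes a per-filesystem count of write opens
maintained in ->open(), as Jan suggests above.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every name here is a hypothetical stand-in. */
struct model_inode {
	int i_writecount;  /* models inode->i_writecount */
	int private_opens; /* models a per-fs count of write opens kept by ->open() */
};

/* What each candidate test in ->release() would decide. */
struct verdict {
	bool eq1;  /* i_writecount == 1 */
	bool gt1;  /* i_writecount > 1 */
	bool priv; /* private_opens == 1 */
};

/* Evaluate the three tests at the moment the only real writer is released. */
struct verdict last_writer_release(bool racing_truncate)
{
	struct model_inode ino = { 0, 0 };
	struct verdict v;

	/* One successful open for write; it will get a ->release() later. */
	ino.i_writecount++;
	ino.private_opens++;

	/*
	 * Racing open(O_TRUNC): write access taken, lock check not yet failed.
	 * No struct file exists for it, so only i_writecount moves.
	 */
	if (racing_truncate)
		ino.i_writecount++;

	/* ->release() of the real writer runs here, before its own write
	 * access is dropped. */
	v.eq1 = (ino.i_writecount == 1);
	v.gt1 = (ino.i_writecount > 1);
	v.priv = (ino.private_opens == 1);

	/* Unwind: the writer goes away, then the failed open drops its reference. */
	ino.private_opens--;
	ino.i_writecount--;
	if (racing_truncate)
		ino.i_writecount--;
	assert(ino.i_writecount == 0 && ino.private_opens == 0);

	return v;
}
```

In this model, calling last_writer_release(true) gives eq1 false, gt1 true and
priv true, while last_writer_release(false) gives eq1 true, gt1 false and priv
true. So only the private counter is unaffected by the transient reference, at
least under the assumptions above.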

Regards,
Fabian

>
>                                                               Honza
> --
> Jan Kara <jack@xxxxxxx>
> SUSE Labs, CR