Re: [PATCH v3] vfs: fix a bug when we do some dio reads with append dio writes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Mon, 2013-12-16 at 11:01 +0100, Jan Kara wrote:
> On Mon 16-12-13 09:37:34, Steven Whitehouse wrote:
> > Hi,
> > 
> > On Sat, 2013-12-07 at 18:55 +0800, Zheng Liu wrote:
> > > From: Zheng Liu <wenqing.lz@xxxxxxxxxx>
> > > 
> > > Currently we check i_size in buffered read path after we know the page
> > > is update.  But it could return some zero-filled pages to the userspace
> > > when we do some append dio writes.  We could use the following code
> > > snippet to reproduce this problem in a ext2/3/4 filesystem.
> > > 
> > If the page is not uptodate, then neither (potentially) is i_size, since
> > the underlying fs has not had a chance to get its own locks and update
> > the inode size.
>   Hum, this isn't really about page being uptodate or not, is it? It is
> more about the fact that gfs2 updates i_size only from gfs2_readpage()
> function when obtaining inode lock. Or do you know about other filesystem
> having the same problem as gfs2? E.g. ocfs2 updates i_size from
> ocfs2_file_aio_read() before calling generic_file_aio_read(), cannot gfs2
> do something similar?
> 
Well we could do that, but I don't think it makes sense really. The
point of having the page cache is to put it "in front" of the expensive
operations where possible (such as getting cluster locks). I don't think
it makes sense to grab and drop a cluster lock on every call to
generic_file_aio_read() when, for cached pages, we would not even need
to call ->readpage.

The problem appears to be that somehow we are getting a page returned
when it should not be. So I suspect that the correct place to catch that
is in ->readpage(s)

> > I suspect that the correct fix would be to implement ->launder_page to
> > avoid the race that you've identified here, if I've understood it
> > correctly,
>   I don't understand how that would work. Can you elaborate a bit? Here the
> problem is i_size gets extended by DIO write during buffered read (which is
> a fallback from DIO read) so we return zero-filled page instead of either
> data filled page or short read. I don't see where launder_page() would come
> into the picture...
> 
> 								Honza

Ah, sorry. I was thinking the other way around wrt to
read/write :( However, I still don't think that generic_file_aio_read()
is the right place to fix this. I note that XFS seems to pass the test
and it also uses mpage_readpage and mpage_readpages as well as
generic_file_aio_read() so maybe that is a clue as to where the answer
lies. GFS2 seems to fail the test in the same way as ext3 at the moment.

The other question that we've not really answered is why it is useful to
mix buffered and direct i/o to the same file at the same time anyway.
While I agree that it ought to work, I'm not convinced that its
enormously useful, which is maybe why it has not been spotted before,

Steve.


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux