Re: [rfc][patch] mm: direct io less aggressive syncs and invalidates

On Tue, Oct 28, 2008 at 05:11:02PM -0400, Jeff Moyer wrote:
> Nick Piggin <npiggin@xxxxxxx> writes:
> 
> > Direct IO can invalidate and sync a lot of pagecache pages in the mapping. A
> > 4K direct IO will actually try to sync and/or invalidate the pagecache of the
> > entire file, for example (which might be many GB or TB large).
> >
> > Improve this by doing range syncs. Also, memory no longer has to be unmapped
> > to catch the dirty bits for syncing, as dirty bits would remain coherent due to
> > dirty mmap accounting.
> >
> > This should fix the immediate DM deadlocks when doing direct IO reads to
> > block device with a mounted filesystem, if only by papering over the problem
> > somewhat rather than addressing the fsync starvation cases. Not that the
> > patch itself is a hack, but for this particular problem it is not really
> > the correct solution IMO. But anyway, this might be more appropriate to go
> > into stable kernels if this DM deadlock is biting users.
> >
> > Yes, I still need to put more time into finishing my pagecache tag based
> > sync solution. Sorry :(
> >
> >
> > ---
> > Index: linux-2.6/mm/filemap.c
> > ===================================================================
> > --- linux-2.6.orig/mm/filemap.c	2008-10-03 11:21:31.000000000 +1000
> > +++ linux-2.6/mm/filemap.c	2008-10-03 12:00:17.000000000 +1000
> > @@ -1304,11 +1304,8 @@ generic_file_aio_read(struct kiocb *iocb
> >  			goto out; /* skip atime */
> >  		size = i_size_read(inode);
> >  		if (pos < size) {
> > -			retval = filemap_write_and_wait(mapping);
> > -			if (!retval) {
> > -				retval = mapping->a_ops->direct_IO(READ, iocb,
> > +			retval = mapping->a_ops->direct_IO(READ, iocb,
> >  							iov, pos, nr_segs);
> > -			}
> 
> So why is it safe to get rid of this?  Can't this result in reading
> stale data from disk?

AFAIKS, __blockdev_direct_IO is doing the same thing for us when it
encounters a READ. I should have documented this change. This is one
thing I'm not *quite* sure of, though: there might be a path to the block
device that I haven't considered, and which does not do the sync...
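
To make the "range" part concrete, here is a minimal userspace sketch of the
arithmetic involved: given the file position and length of a direct IO, it
computes the inclusive byte range and the pagecache index range that a sync
or invalidate would need to cover. It is not code from the patch. The names
dio_range, dio_range_for and SKETCH_PAGE_SHIFT are made up for illustration,
and 4K pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Hypothetical stand-in for the kernel's page shift. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical helper type: the span of a file touched by one direct IO. */
struct dio_range {
	int64_t  byte_start;	/* first byte touched */
	int64_t  byte_end;	/* last byte touched, inclusive */
	uint64_t page_start;	/* index of first pagecache page covered */
	uint64_t page_end;	/* index of last pagecache page covered */
};

/*
 * Compute the range covered by a direct IO of 'len' bytes at offset 'pos'.
 * Only this range needs syncing or invalidating, not the whole mapping.
 */
static struct dio_range dio_range_for(int64_t pos, int64_t len)
{
	struct dio_range r;

	assert(pos >= 0 && len > 0);

	r.byte_start = pos;
	r.byte_end = pos + len - 1;
	r.page_start = (uint64_t)r.byte_start >> SKETCH_PAGE_SHIFT;
	r.page_end = (uint64_t)r.byte_end >> SKETCH_PAGE_SHIFT;
	return r;
}
```

With these assumptions, a page-aligned 4K request touches exactly one page
index no matter how large the file is, while an unaligned request that
straddles a page boundary touches two.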

 
> The rest looks good to me.  I ran the aio-dio-regress tests against this
> kernel on a UP machine, and they all passed.  The kernel didn't boot on
> my SMP box, though.  Nick, any chance you could grab that test suite and
> run it on an smp system?
>   http://git.kernel.org/?p=linux/kernel/git/zab/aio-dio-regress.git;a=summary

Yeah, I could give that a shot and repost the patch for Andrew in a day or
two. Thanks for looking at it.