Re: [RFC PATCH 0/2] O_DIRECT locking rework

On Fri, 20 Oct 2006 14:32:37 -0400
Chris Mason <chris.mason@xxxxxxxxxx> wrote:

> Hello everyone,
> 
> O_DIRECT locking currently uses a few different per-inode locks to
> prevent races between buffered io and direct io.  This is awkward, and
> sometimes grows races where we expose old data on disk.
> 
> For example, I can't quite see how we protect from an mmap triggered
> writepage from filling a hole in the middle of an O_DIRECT read.
> 
> This patch set changes O_DIRECT to use page locks instead of
> mutex/semaphores.  It looks in the radix tree for pages affected by this
> O_DIRECT read/write and locks any pages it finds.
> 
> For any pages not present, a stub page struct is inserted into the
> radix tree.  The page cache routines are changed to either wait on this
> place holder page or ignore it as appropriate.  Place holders are not
> valid pages at all, you can't trust page->index or any other field.
> 
> The first patch introduces these place holder pages.  The second patch
> changes fs/direct-io.c to use them.  Patch #2 needs work,
> direct-io.c:lock_page_range can be made much faster, and it needs to be
> changed to work in chunks instead of pinning down the whole range at
> once.
> 
> But, this is enough for people to comment on the basic idea.  Testing
> has been very light. I'm not sure I've covered all of the buffered vs
> direct races yet.  The main goal of posting now is to talk about the
> place holder pages and possible optimizations.
> 
> For the XFS guys, you probably want to avoid the page locking steps as
> well, a later version will honor that.
> 

Boy, it doesn't do much to simplify the code, does it?  An opportunity for
Linus to again share with us his opinions on direct-io.

Conceptually, sticking locked pages into the mapping is a good thing to do,
because it meshes in with all our existing synchronisation design.


I think the fake placeholder page can be a kernel-wide thing rather than
per-dio?  That would be most desirable because then we have

#define PagePlaceHolder(page)	(page == global_placeholder_page)

which saves a precious page flag.
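
To make that concrete, here is a rough userspace sketch of the idea, not kernel code. Everything except PagePlaceHolder and global_placeholder_page is hypothetical: struct page is a toy stand-in, toy_mapping is a fixed array standing in for the radix tree, and toy_lock_range/toy_unlock_range are made-up helpers. It only shows that a single shared sentinel, tested by pointer identity, is enough to tell a placeholder from a real page.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's definition. */
struct page {
	unsigned long index;
	int locked;
};

/* One global sentinel. Its fields are never meaningful. */
static struct page global_placeholder;
static struct page *const global_placeholder_page = &global_placeholder;

/* Pointer identity replaces a page flag. */
#define PagePlaceHolder(page)	((page) == global_placeholder_page)

#define TOY_MAPPING_SLOTS 8

/* Hypothetical stand-in for the mapping's radix tree. */
struct toy_mapping {
	struct page *slots[TOY_MAPPING_SLOTS];
};

/*
 * Walk [start, end]: lock real pages, fill holes with the sentinel.
 * Returns the number of placeholders inserted.
 */
static size_t toy_lock_range(struct toy_mapping *m, size_t start, size_t end)
{
	size_t inserted = 0;

	assert(start <= end && end < TOY_MAPPING_SLOTS);
	for (size_t i = start; i <= end; i++) {
		if (m->slots[i] == NULL) {
			m->slots[i] = global_placeholder_page;
			inserted++;
		} else if (!PagePlaceHolder(m->slots[i])) {
			m->slots[i]->locked = 1;
		}
	}
	return inserted;
}

/* Undo toy_lock_range: drop placeholders, unlock real pages. */
static void toy_unlock_range(struct toy_mapping *m, size_t start, size_t end)
{
	assert(start <= end && end < TOY_MAPPING_SLOTS);
	for (size_t i = start; i <= end; i++) {
		if (m->slots[i] == NULL)
			continue;
		if (PagePlaceHolder(m->slots[i]))
			m->slots[i] = NULL;
		else
			m->slots[i]->locked = 0;
	}
}
```

Because the sentinel is shared, it carries no per-slot state; the offset has to come from the lookup position, which fits the rule that page->index of a placeholder cannot be trusted. The sketch ignores concurrency and waiting entirely.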

