Re: Hole punching and mmap races

Dave Chinner <david@xxxxxxxxxxxxx> · Sat, 9 Jun 2012 09:06:16 +1000

On Fri, Jun 08, 2012 at 11:36:29PM +0200, Jan Kara wrote:
> On Fri 08-06-12 10:57:00, Dave Chinner wrote:
> > On Thu, Jun 07, 2012 at 11:58:35PM +0200, Jan Kara wrote:
> > > On Wed 06-06-12 23:36:16, Dave Chinner wrote:
> > > Also we could implement the common case of locking a range
> > > containing single page by just taking page lock so we save modification of
> > > interval tree in the common case and generally make the tree smaller (of
> > > course, at the cost of somewhat slowing down cases where we want to lock
> > > larger ranges).
> > 
> > That seems like premature optimisation to me, and all the cases I
> > think we need to care about are locking large ranges of the tree.
> > Let's measure what the overhead of tracking everything in a single
> > tree is first so we can then see what needs optimising...
>   Umm, I agree that initially we probably want just to have the mapping
> range lock ability, stick it somewhere to IO path and make things work.
> Then we can look into making it faster / merging with page lock.
> 
> However I disagree we care most about locking large ranges. For all
> buffered IO and all page faults we need to lock a range containing just a
> single page. We cannot lock more due to locking constraints with mmap_sem.

Not sure I understand what that constraint is - I have been thinking
that the buffered IO range lock would be across the entire IO, not
individual pages.

As it is, if we want to do multipage writes (and we do), we have to
be able to lock a range of the mapping in the buffered IO path at a
time...
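
To make the single-page vs. multipage case concrete, here is a minimal
userspace sketch of range-lock semantics. It is not kernel code and not a
proposed implementation: the names (mapping_range_lock, range_trylock,
range_unlock, MAX_HELD) are hypothetical, it uses a fixed array instead of
an interval tree, and it only try-locks rather than sleeping. It just shows
that a multipage write holding pages 0-3 excludes a fault on page 2 but not
one on page 4.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HELD 16	/* hypothetical cap; a real version would use an interval tree */

/* One held range of page indexes, inclusive on both ends. */
struct range {
	unsigned long start, end;
	bool used;
};

/* Hypothetical per-mapping range lock state. */
struct mapping_range_lock {
	struct range held[MAX_HELD];
};

static bool ranges_overlap(unsigned long s1, unsigned long e1,
			   unsigned long s2, unsigned long e2)
{
	return s1 <= e2 && s2 <= e1;
}

/*
 * Try to lock [start, end]. Returns a slot index on success, or -1 if the
 * range overlaps a held range or no slot is free. A blocking version would
 * sleep and retry instead of returning -1.
 */
static int range_trylock(struct mapping_range_lock *l,
			 unsigned long start, unsigned long end)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_HELD; i++) {
		if (!l->held[i].used) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (ranges_overlap(start, end,
				   l->held[i].start, l->held[i].end))
			return -1;
	}
	if (free_slot < 0)
		return -1;

	l->held[free_slot] = (struct range){ .start = start, .end = end,
					     .used = true };
	return free_slot;
}

/* Release a range previously returned by range_trylock(). */
static void range_unlock(struct mapping_range_lock *l, int slot)
{
	assert(slot >= 0 && slot < MAX_HELD && l->held[slot].used);
	l->held[slot].used = false;
}
```

A single-page lock is just the start == end case here, so whether it
deserves a separate fast path is exactly the thing to measure.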

> So the places that will lock larger ranges are: direct IO, truncate, punch
> hole. Writeback actually doesn't seem to need any additional protection at
> least as I've sketched out things so far.

AFAICT, writeback needs protection against punching holes, just like
mmap does, because they use the same "avoid truncated pages"
mechanism.

> So single-page ranges matter at least as much as longer ranges. That's why
> I came up with that page lock optimisation and merging...

I agree they are common, but let's measure the overhead first before
trying to optimise/special case certain behaviours....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx