Re: [PATCH 01/15 v2] mm: add invalidatepage_range address space operation

On Tue, 4 Sep 2012, Andrew Morton wrote:

> Date: Tue, 4 Sep 2012 16:43:16 -0700
> From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> To: Lukas Czerner <lczerner@xxxxxxxxxx>
> Cc: linux-fsdevel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, tytso@xxxxxxx,
>     hughd@xxxxxxxxxx, linux-mm@xxxxxxxxx
> Subject: Re: [PATCH 01/15 v2] mm: add invalidatepage_range address space
>     operation
> 
> On Fri, 31 Aug 2012 18:21:37 -0400
> Lukas Czerner <lczerner@xxxxxxxxxx> wrote:
> 
> > Currently there is no way to truncate a partial page where the end
> > truncate point is not at the end of the page. This is because it was
> > not needed, and the existing functionality was enough for the file
> > system truncate operation to work properly. However, more file systems
> > now support the punch hole feature, and it can benefit from mm
> > supporting truncation of a page just up to a certain point.
> > 
> > Specifically, with this functionality truncate_inode_pages_range() can
> > be changed so that it supports truncating a partial page at the end of
> > the range (currently it will BUG_ON() if 'end' is not at the end of the
> > page).
> > 
> > This commit adds a new address space operation, invalidatepage_range,
> > which allows specifying the length in bytes to invalidate, rather than
> > assuming truncation to the end of the page. It also introduces the
> > block_invalidatepage_range() and do_invalidatepage_range() functions
> > for exactly the same reason.
> > 
> > The caller does not have to implement both aops (invalidatepage and
> > invalidatepage_range) and the latter is preferred. The old method will be
> > used only if invalidatepage_range is not implemented by the caller.
> > 
> > ...
> >
> > +/**
> > + * do_invalidatepage_range - invalidate range of the page
> > + *
> > + * @page: the page which is affected
> > + * @offset: start of the range to invalidate
> > + * @length: length of the range to invalidate
> > +  */
> > +void do_invalidatepage_range(struct page *page, unsigned int offset,
> > +			     unsigned int length)
> > +{
> > +	void (*invalidatepage_range)(struct page *, unsigned int,
> > +				     unsigned int);
> >  	void (*invalidatepage)(struct page *, unsigned long);
> > +
> > +	/*
> > +	 * Try invalidatepage_range first
> > +	 */
> > +	invalidatepage_range = page->mapping->a_ops->invalidatepage_range;
> > +	if (invalidatepage_range) {
> > +		(*invalidatepage_range)(page, offset, length);
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * When only invalidatepage is registered length + offset must be
> > +	 * PAGE_CACHE_SIZE
> > +	 */
> >  	invalidatepage = page->mapping->a_ops->invalidatepage;
> > +	if (invalidatepage) {
> > +		BUG_ON(length + offset != PAGE_CACHE_SIZE);
> > +		(*invalidatepage)(page, offset);
> > +	}
> >  #ifdef CONFIG_BLOCK
> > -	if (!invalidatepage)
> > -		invalidatepage = block_invalidatepage;
> > +	if (!invalidatepage_range && !invalidatepage)
> > +		block_invalidatepage_range(page, offset, length);
> >  #endif
> > -	if (invalidatepage)
> > -		(*invalidatepage)(page, offset);
> >  }
> 
> This interface is ...  strange.  If the caller requests a
> non-page-aligned invalidation against an fs which doesn't implement
> ->invalidatepage_range then the kernel goes BUG.  So the caller must
> know beforehand that the underlying fs _does_ implement
> ->invalidatepage_range.
> 
> For practical purposes, this implies that invalidation of a
> non-page-aligned region will only be performed by fs code, because the
> fs implicitly knows that it implements ->invalidatepage_range.
> 
> However this function isn't exported to modules, so scratch that.
> 
> So how is calling code supposed to determine whether it can actually
> _use_ this interface?

Right now the only place we use ->invalidatepage_range is
do_invalidatepage_range(), which is only called from
truncate_inode_pages_range(). Without these patches
truncate_inode_pages_range() triggers a BUG() if it gets an unaligned
range, so it is the file system's responsibility to take care of the
alignment. All file systems currently do that, unless there is a
bug (as in ocfs2).

So currently callers of truncate_inode_pages_range() know that the
range has to be aligned, and with these patches they should know (it
is documented in the function comment, after all) that if they want
to pass an unaligned range, the underlying file system has to
implement ->invalidatepage_range().
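
To make "unaligned range" concrete, here is a minimal userspace sketch
of how an inclusive byte range splits into a partial first page, whole
pages, and a partial last page. It is not the kernel's
truncate_inode_pages_range(); split_range(), needs_range_op(),
struct range_split and MODEL_PAGE_SIZE are hypothetical names, and a
4096-byte page is assumed. A piece needs ->invalidatepage_range()
exactly when its offset + length is not the page size, which is the
condition the BUG_ON() in the patch checks.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed stand-in for PAGE_CACHE_SIZE */

/* Hypothetical description of how a byte range covers pages. */
struct range_split {
	unsigned int head_offset;	/* start offset in the partial first page */
	unsigned int head_length;	/* 0 if there is no partial first page */
	uint64_t first_full;		/* index of the first whole page */
	uint64_t nr_full;		/* number of whole pages */
	unsigned int tail_length;	/* bytes from offset 0 of the last page, 0 if none */
};

/* Split the inclusive byte range [lstart, lend] into page-sized pieces. */
struct range_split split_range(uint64_t lstart, uint64_t lend)
{
	struct range_split s = { 0 };
	uint64_t end_excl;

	assert(lstart <= lend && lend < UINT64_MAX);
	end_excl = lend + 1;

	/* The whole range sits inside one page without covering all of it. */
	if (lstart / MODEL_PAGE_SIZE == lend / MODEL_PAGE_SIZE &&
	    (lstart % MODEL_PAGE_SIZE || end_excl % MODEL_PAGE_SIZE)) {
		s.head_offset = (unsigned int)(lstart % MODEL_PAGE_SIZE);
		s.head_length = (unsigned int)(end_excl - lstart);
		return s;
	}

	/* Partial first page: runs from head_offset to the end of that page. */
	if (lstart % MODEL_PAGE_SIZE) {
		s.head_offset = (unsigned int)(lstart % MODEL_PAGE_SIZE);
		s.head_length = MODEL_PAGE_SIZE - s.head_offset;
	}

	/* Partial last page: runs from offset 0 up to the end point. */
	s.tail_length = (unsigned int)(end_excl % MODEL_PAGE_SIZE);

	s.first_full = lstart / MODEL_PAGE_SIZE + (lstart % MODEL_PAGE_SIZE != 0);
	s.nr_full = end_excl / MODEL_PAGE_SIZE - s.first_full;
	return s;
}

/* Nonzero if some piece does not end at a page boundary. */
int needs_range_op(const struct range_split *s)
{
	return s->tail_length != 0 ||
	       (s->head_length != 0 &&
		s->head_offset + s->head_length != MODEL_PAGE_SIZE);
}
```

For example, split_range(1000, 10000) yields a partial first page
(offset 1000, length 3096), one whole page (index 1), and a partial
last page of 1809 bytes. Only the last piece is something the old
->invalidatepage(page, offset) cannot express.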

Now I agree that the only one who has this information is the file
system itself. But both truncate_pagecache_range() and
truncate_inode_pages_range() are used from within the file system, as
you pointed out earlier, so it does not look like a real problem to
me. But I have to admit that it is a bit strange.

However, if we wanted to keep ->invalidatepage_range() and
->invalidatepage() completely separate, we would also need separate
versions of truncate_inode_pages_range() and
truncate_pagecache_range() for the separation to actually matter.
And IMO this would be much worse...

As it is now, the BUG_ON() forces the caller to implement
->invalidatepage_range() if he wants to invalidate an unaligned
range, in much the same way as we would force him to implement it if
he wanted to use a 'new' truncate_inode_pages_range() or
truncate_pagecache_range().
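
The dispatch order under discussion can be modelled in userspace
roughly as follows. This is only a sketch of the logic in the quoted
hunk, not kernel code: every model_* name, new_fs, old_fs and
MODEL_PAGE_SIZE are hypothetical, a 4096-byte page is assumed, and the
BUG_ON() is replaced by a MODEL_WOULD_BUG return value so that the
failing case can be observed instead of aborting.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed stand-in for PAGE_CACHE_SIZE */

struct model_page;

/* Hypothetical stand-in for the two aops discussed in the thread. */
struct model_aops {
	void (*invalidatepage_range)(struct model_page *, unsigned int,
				     unsigned int);
	void (*invalidatepage)(struct model_page *, unsigned long);
};

struct model_page {
	const struct model_aops *a_ops;
	unsigned int inval_start;	/* records what was invalidated */
	unsigned int inval_len;
};

enum model_result {
	MODEL_RANGE_OP,		/* ->invalidatepage_range was used */
	MODEL_LEGACY_OP,	/* ->invalidatepage was used */
	MODEL_BLOCK_FALLBACK,	/* neither op set, generic fallback used */
	MODEL_WOULD_BUG		/* the patch would hit BUG_ON() here */
};

void model_record(struct model_page *page, unsigned int offset,
		  unsigned int length)
{
	assert(offset + length <= MODEL_PAGE_SIZE);
	page->inval_start = offset;
	page->inval_len = length;
}

void model_fs_range_op(struct model_page *page, unsigned int offset,
		       unsigned int length)
{
	model_record(page, offset, length);
}

/* The old op always invalidates from offset to the end of the page. */
void model_fs_legacy_op(struct model_page *page, unsigned long offset)
{
	model_record(page, (unsigned int)offset,
		     MODEL_PAGE_SIZE - (unsigned int)offset);
}

const struct model_aops new_fs = { .invalidatepage_range = model_fs_range_op };
const struct model_aops old_fs = { .invalidatepage = model_fs_legacy_op };

enum model_result model_do_invalidatepage_range(struct model_page *page,
						unsigned int offset,
						unsigned int length)
{
	const struct model_aops *ops = page->a_ops;

	/* Try the range-capable operation first. */
	if (ops->invalidatepage_range) {
		ops->invalidatepage_range(page, offset, length);
		return MODEL_RANGE_OP;
	}

	/* The old operation only works if the range ends at the page end. */
	if (ops->invalidatepage) {
		if (offset + length != MODEL_PAGE_SIZE)
			return MODEL_WOULD_BUG;
		ops->invalidatepage(page, offset);
		return MODEL_LEGACY_OP;
	}

	/* Neither op is set: models the block_invalidatepage_range() path. */
	model_record(page, offset, length);
	return MODEL_BLOCK_FALLBACK;
}
```

With a page whose a_ops is &old_fs, a call with offset 100 and length
200 returns MODEL_WOULD_BUG, while offset 1024 and length 3072 goes
through the old operation. With &new_fs both calls succeed. That is
the sense in which the caller has to know beforehand which operation
the file system implements.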

I am intentionally not mentioning do_invalidatepage_range(), since
truncate_inode_pages_range() is currently its only user that may
pass an unaligned range.

Thanks!
-Lukas

> 
> 
> Also...  one would obviously like to see the old ->invalidatepage() get
> removed entirely.  But about 20 filesystems implement
> ->invalidatepage() and implementation of ->invalidatepage_range() is
> non-trivial and actually unnecessary.
> 
> So I dunno.  Perhaps we should keep ->invalidatepage() and
> ->invalidatepage_range() completely separate.
> 