Re: Why doesn't zap_pte_range() call page_mkwrite()

On Tue, Sep 08, 2009 at 05:41:32PM +0200, Nick Piggin wrote:
> On Tue, Sep 08, 2009 at 11:30:07AM -0400, Chris Mason wrote:
> > > > As I said, I think I can fix the NFS problem by simply unmapping the
> > > > page inside ->writepage() whenever we know the write request was
> > > > originally set up by a page fault.
> > > 
> > > The biggest outstanding problem we have remaining is get_user_pages.
> > > Callers are only required to hold a ref on the page and then they
> > > can call set_page_dirty at any point after that.
> > > 
> > > I have a half-done patch somewhere to add a put_user_pages, and then
> > > we could probably go from there to pinning the fs metadata (whether
> > > by using the page lock or something else, I don't quite know).
> > 
> > Hi everyone,
> > 
> > Sorry for digging up an old thread, but is there any reason we can't
> > just use page_mkwrite here?  I'd love to get rid of the btrfs code to
> > detect places that use set_page_dirty without a page_mkwrite.
> 
> It is because page_mkwrite must be called before the page is dirtied
> (it may fail, it theoretically may do something crazy with the previous
> clean page data). And in several places I think it gets called from a
> nasty context.
> 
> It hasn't fallen completely off my radar. fsblock has the same issue
> (although I've just been ignoring gup writes into fsblock fs for the
> time being).

Ok, I'll change my detection code a bit then.

> 
> I have a basic idea of what to do... It would be nice to change the calling
> convention of get_user_pages and take the page lock. Database people might
> scream, in which case we could only take the page lock for filesystems that
> define ->page_mkwrite (so shared mem segments avoid the overhead). Lock
> ordering might get a bit interesting, but if we can have callers ensure they
> always submit and release partially fulfilled requests, then we can always
> trylock them.
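
For readers following along, the trylock scheme in the quoted paragraph can be modelled roughly as below. This is only a user-space sketch, not kernel code: struct model_page, model_page_init(), model_trylock_pages() and model_release_pages() are hypothetical names invented for illustration, and a pthread mutex stands in for the page lock. The point it tries to show is that a caller never sleeps on a page lock while holding others; it takes what it can, submits and releases that partial batch, and then asks again for the remainder.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a page; this is NOT struct page. */
struct model_page {
    pthread_mutex_t lock;   /* plays the role of the page lock */
    int dirty;              /* set once the caller has written to the page */
};

/* Initialise a model page as unlocked and clean. */
static void model_page_init(struct model_page *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->dirty = 0;
}

/*
 * Try to lock pages[0..n-1] in order without ever blocking.
 * Returns the number of pages locked. A result smaller than n is a
 * partially fulfilled request: the caller is expected to submit and
 * release it before asking for the remaining pages.
 */
static size_t model_trylock_pages(struct model_page **pages, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Contended page: stop here rather than sleep with locks held. */
        if (pthread_mutex_trylock(&pages[i]->lock) != 0)
            break;
    }
    return i;
}

/* "Submit and release": mark the locked pages dirty, then unlock them. */
static void model_release_pages(struct model_page **pages, size_t locked)
{
    size_t i;

    for (i = 0; i < locked; i++) {
        pages[i]->dirty = 1;
        pthread_mutex_unlock(&pages[i]->lock);
    }
}
```

A caller in this model would loop: call model_trylock_pages() on the remaining pages, pass the returned count to model_release_pages(), advance by that count, and repeat until the whole request is done. Because no lock is ever waited for while another is held, lock ordering between pages stops mattering in the model; whether that carries over to the real get_user_pages callers is exactly the open question in the thread.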

I think everyone will have page_mkwrite eventually, at least everyone
who the databases will care about ;)

-chris