On Tue, 31 Mar 2009 12:55:16 -0700 (PDT) Sage Weil <sage@xxxxxxxxxxxx> wrote: > On Mon, 30 Mar 2009, Nick Piggin wrote: > > [Fixed linux-mm address. Please reply here] > > > > Hi, > > > > I'd like opinions on this patch (applies on top of the previous > > page_mkwrite fixes in -mm). I was not going to ask to merge it > > immediately however it appears that fsblock is not the only one who > > needs it... > > -- > > > > I want to have the page be protected by page lock between page_mkwrite > > notification to the filesystem, and the actual setting of the page > > dirty. Do this by allowing the filesystem to return a locked page from > > page_mkwrite, and have the page fault code keep it held until after it > > calls set_page_dirty. > > > > I need this in fsblock because I am working to ensure filesystem metadata > > can be correctly allocated and refcounted. This means that page cleaning > > should not require memory allocation. wot? "page cleaning" involves writeout. How can we avoid doing allocations there? > > Without this patch, then for example we could get a concurrent writeout > > after the page_mkwrite (which allocates page metadata required to clean > > it), but before the set_page_dirty. The writeout will clean the page and > > notice that the metadata is now unused and may be deallocated (because > > it appears clean as set_page_dirty hasn't been called yet). So at this > > point the page may be dirtied via the pte without enough metadata to be > > able to write it back. > > > > Sage needs this race closed for ceph, and Trond maybe for NFS. > > I ran a few tests and this fixes the problem for me (although fyi the > patch didn't apply cleanly on top of your previously posted page_mkwrite > prototype change patch, due to some differences in block_page_mkwrite). What is "the problem"? Can we get "the problem"'s description included in the changelog? The patch is fairly ugly, somewhat costly and makes things (even) more complex. Sigh. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html