On Wed, Nov 17, 2010 at 3:11 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> Really, my understanding is that not pre-allocating filesystem blocks
>> is just fine. This is, after all, what happens with ext3 and it's
>> never been reported as a bug (that I know of).
>
> It's not ext3 you have to worry about - it's the filesystems that
> need special state set up on their pages/buffers for ->writepage to
> work correctly that are the problem. You need to call
> ->write_begin/->write_end to get the state set up properly.
>
> If this state is not set up properly, silent data loss will occur
> during mmap writes either by ENOSPC or failing to set up writes into
> unwritten extents correctly (i.e. we'll be back to where we were in
> 2.6.15).
>
> I don't think ->page_mkwrite can be worked around - we need that to
> be called on the first write fault of any mmap()d page to ensure it
> is set up correctly for writeback. If we don't get write faults
> after the page is mlock()d, then we need the ->page_mkwrite() call
> during the mlock() call.

Just to be clear - I'm proposing to skip the entire do_wp_page() call
by doing a read fault rather than a write fault. If the page wasn't
dirty already, it will stay clean and with a non-writable PTE until it
gets actually written to, at which point we'll get a write fault and
do_wp_page will be invoked as usual.

I am not proposing to skip the page_mkwrite() while upgrading the PTE
permissions, which I think is what you were arguing against?

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
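
For reference, the userspace sequence this discussion is about can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name write_through_locked_mapping() and its `locked` out-parameter are made up for the example, and the comments about fault behaviour restate the proposal above rather than describe any particular kernel. The file is created sparse, so no filesystem blocks back the page when it is mapped and locked.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map one page of a sparse
 * file MAP_SHARED, try to mlock() it, store one byte, flush it, and
 * read it back through the file descriptor.
 *
 * Returns 0 on success, -1 on error. *locked is set to 1 if mlock()
 * succeeded and 0 if it did not (e.g. because of a low RLIMIT_MEMLOCK).
 */
int write_through_locked_mapping(const char *path, int *locked)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Extend the file to one page without writing data: a hole. */
    if (ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /*
     * mlock() faults the page in. Under the proposal in this thread it
     * would do so with a read fault, leaving the page clean and its PTE
     * non-writable.
     */
    *locked = (mlock(p, (size_t)page) == 0);

    /*
     * First store to the page. With the proposal, this is where the
     * write fault (and with it ->page_mkwrite) would happen.
     */
    p[0] = 'x';

    /* Force writeback, then check the byte is visible via the fd. */
    int rc = msync(p, (size_t)page, MS_SYNC);
    char back = 0;
    if (rc == 0 && pread(fd, &back, 1, 0) != 1)
        rc = -1;
    if (rc == 0 && back != 'x')
        rc = -1;

    munmap(p, (size_t)page);
    close(fd);
    return rc;
}
```

Calling it with a path on the filesystem of interest, e.g. write_through_locked_mapping("/tmp/mlock-demo.bin", &locked), exercises the mmap(), mlock(), store, msync() path. It does not by itself show which kind of fault the kernel took; that would need tracing on the kernel side.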