On Thu, Dec 20, 2012 at 03:10:10PM -0800, Andy Lutomirski wrote: > The onus is currently on filesystems to call file_update_time > somewhere in the page_mkwrite path. This is unfortunate for three > reasons: > > 1. page_mkwrite on a locked page should be fast. ext4, for example, > often sleeps while dirtying inodes. That's an ext4 problem, not a page fault or timestamp update problem. Fix ext4. > 2. The current behavior is surprising -- the timestamp resulting from > an mmaped write will be before the write, not after. This contradicts > the mmap(2) manpage, which says: > > The st_ctime and st_mtime field for a file mapped with PROT_WRITE and > MAP_SHARED will be updated after a write to the mapped region, and > before a subsequent msync(2) with the MS_SYNC or MS_ASYNC flag, if one > occurs. What you propose (time updates in do_writepages()) violates this. msync(MS_ASYNC) doesn't actually start any IO, therefore the timestamp wil not be updated. Besides, what POSIX actually says is: | The st_ctime and st_mtime fields of a file that is mapped with | MAP_SHARED and PROT_WRITE shall be marked for update at some point | in the interval between a write reference to the mapped region and | the next call to msync() with MS_ASYNC or MS_SYNC for that portion | of the file by any process. Which means updating the timestamp during the first write is perfectly acceptible. Indeed, by definition, we are compliant with the man page because the update is after the write has occurred. That is, the write triggered the page fault, so the page fault processing where we update the timestamps is definitely after the write occurred. :) > 3. (An ulterior motive) I'd like to use hardware dirty tracking for > shared, locked, writable mappings (instead of page faults). Moving > important work out of the page_mkwrite path is an important first step. I don't think you're going to get very far doing this. page_mkwrite needs to do: a) block allocation in page_mkwrite() for faults over holes to detect ENOSPC conditions immediately rather than in writeback when such an error results in data loss. b) detect writes over unwritten extents so that the pages in the page cache can be set up correctly for conversion to occur during writeback. Correcting these two problems was the reason for introducing page_mkwrite in the first place - we need to do this stuff before the page fault is completed, and that means, by definition, page_mkwrite needs to be able to block. Moving c/mtime updates out of the way does not, in any way, change these requirements. Perhaps you should implement everything you want to do inside ext4 first, so we can get an idea of exactly what you want page_mkwrite() to do, how you want it to behave, and how you expect filesystems to handle the above situations correctly.... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html