On 03/01/2012 03:41 AM, Jan Kara wrote: > Hello, > > to provide reliable support for filesystem freezing, filesystems need to have > complete control over when metadata is changed. In particular, > file_update_time() calls from page fault code make it impossible for > filesystems to prevent inodes from being dirtied while the filesystem is > frozen. > > To fix the issue, this patch set changes page fault code to call > file_update_time() only when ->page_mkwrite() callback is not provided. If the > callback is provided, it is the responsibility of the filesystem to perform > update of i_mtime / i_ctime if needed. We also push file_update_time() call > to all existing ->page_mkwrite() implementations if the time update does not > obviously happen by other means. If you know your filesystem does not need > update of modification times in ->page_mkwrite() handler, please speak up and > I'll drop the patch for your filesystem. > > As a side note, an alternative would be to remove call of file_update_time() > from page fault code altogether and require all filesystems needing it to do > that in their ->page_mkwrite() implementation. That is certainly possible > although maybe slightly inefficient and would require auditting 100+ > vm_operations_structs *shiver*. IMO updating file times should happen when changes get written out, not when a page is made writable, for two reasons: 1. Correctness. With the current approach, it's very easy for files to be changed after the last mtime update -- any changes between mkwrite and actual writeback won't affect mtime. 2. Performance. I have an application (presumably guessable from my email address) for which blocking in page_mkwrite is an absolute show-stopper. (In fact it's so bad that we reverted back to running on Windows until I hacked up a kernel to not do this.) I have an incorrect patch [1] to fix it, but I haven't gotten around to a real fix. (I also have stable pages reverted in my kernel. Some day I'll submit a patch to make it a filesystem option. Or maybe it should even be a block device / queue property like the alignment offset and optimal io size -- there are plenty of block device and file combinations which don't benefit at all from stable pages.) I'd prefer if file_update_time in page_mkwrite didn't proliferate. A better fix is probably to introduce a new inode flag, update it when a page is undirtied, and then dirty and write the inode from the writeback path. (Kind of like my patch, but with an inode flag instead of a page flag, and with the file_update_time done from the fs.) [1] http://patchwork.ozlabs.org/patch/122516/ --Andy -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html