David Chinner wrote:
Using a new API for new functionality is a bad thing?
if an existing API can be used ...
No, it doesn't provide the same functionality. Firstly, XFS attaches a different I/O completion to delalloc writes to allow us to update the file size when the write is beyond the current on-disk EOF. This code cannot do that, as all it does is allocate and present "normal looking" buffers to the generic code path.
good point, I was going to take care of it in a separate patch to support data=ordered.
Secondly, apart from delalloc, XFS cannot use the generic code paths for writeback because unwritten extent conversion also requires custom I/O completion handlers. Given that __mpage_writepage() only calls ->writepage when it is confused, XFS simply cannot use this API.
this doesn't mean fs/mpage.c should go, right?
Also, looking at the way mpage_da_map_blocks() is done - if we have a 128MB delalloc extent, ext4 will allocate it in one go, right? What happens if we then crash after only writing a few megabytes of that extent? Stale data exposure? XFS can allocate multiple gigabytes in a single get_blocks call, so even if ext4 can't do this, it's a problem for XFS.....
what happens if the IO to the 2nd MB has completed while the IO to the 1st MB has not (probably still sitting in the queue)? do you update the on-disk size in this case? how do you track this?
So without the ability to attach specific I/O completions to bios or support for unwritten extents directly in __mpage_writepage, there is no way XFS can use this "generic" delayed allocation code.
I didn't say "generic", see Subject: :)

thanks, Alex