Andrew Morton wrote:
But under this proposal, t_sync_datalist just gets removed: the new
ordered-data mode _only_ needs to do the sb->inode->page walk. So if I'm
understanding you, the way in which we'd handle any such race is to make
kjournald's writeback of the dirty pages block in lock_page(). Once it
gets the page lock it can look to see if some other thread has mapped the
page to disk.
If I'm right about holding a number of pages locked, then they won't be
locked, but under writeback. Of course kjournald can block on writeback as
well, but how does it find only the pages with *newly allocated* blocks?
I don't think we'd want kjournald to do that. Even if a page was dirtied
by an overwrite, we'd want to write it back during commit, just from a
quality-of-implementation point of view. If we were to leave these pages
unwritten during commit then a post-recovery file could have a mix of
up-to-five-second-old data and up-to-30-seconds-old data.
Trying to implement this, I've come to think that there is one significant
difference between t_sync_datalist and the sb->inode->page walk:
t_sync_datalist is per-transaction. IOW, it doesn't change once the
transaction is closed. In contrast, nothing (currently) would prevent others
from modifying pages while commit is in progress. I think this is a serious
disadvantage of the solution.
What I'd propose is a sort of in-core tracker for all data-related IOs in
flight (each assigned to a specific transaction), with the commit thread
waiting for their completion.
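Roughly what I have in mind, as a userspace sketch only: pthreads stand in
for the kernel primitives, and every name here (txn_io_tracker, the
tracker_* functions, the "closed" flag) is made up for illustration and is
not existing JBD code. The assumption is that each data IO is counted
against the transaction it was submitted for, and that a closed transaction
refuses new IOs so the set the commit thread waits on cannot grow.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical per-transaction tracker of data IOs in flight. */
struct txn_io_tracker {
	pthread_mutex_t lock;
	pthread_cond_t  idle;      /* signalled when inflight drops to 0 */
	unsigned long   inflight;  /* data IOs submitted, not yet completed */
	int             closed;    /* set once commit has started */
};

void tracker_init(struct txn_io_tracker *t)
{
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->idle, NULL);
	t->inflight = 0;
	t->closed = 0;
}

/*
 * Account a data IO to this transaction at submission time.
 * Returns 0 on success, -1 if the transaction is already closed
 * (the caller would then have to use the next transaction).
 */
int tracker_io_start(struct txn_io_tracker *t)
{
	int ret = 0;

	pthread_mutex_lock(&t->lock);
	if (t->closed)
		ret = -1;
	else
		t->inflight++;
	pthread_mutex_unlock(&t->lock);
	return ret;
}

/* IO completion path: drop the count and wake the commit thread at zero. */
void tracker_io_done(struct txn_io_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	assert(t->inflight > 0);
	if (--t->inflight == 0)
		pthread_cond_broadcast(&t->idle);
	pthread_mutex_unlock(&t->lock);
}

/* Commit thread: close the transaction, then wait for its data IOs. */
void tracker_commit_wait(struct txn_io_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	t->closed = 1;
	while (t->inflight > 0)
		pthread_cond_wait(&t->idle, &t->lock);
	pthread_mutex_unlock(&t->lock);
}
```

The point of the closed flag is to get back the property t_sync_datalist
has: once commit starts, the set of IOs being waited on is fixed.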
thanks, Alex