Hey Jan, On Mon, Nov 23, 2020 at 6:34 AM Jan Kara <jack@xxxxxxx> wrote: > > ... > > > > The remaining checksum changes due to write-protect failures _seem_ to be > > a race between our write-protect with write_cache_pages() and writeback/sync. > > But I'm not exactly sure as it's been hard to find the timing/steps > > for both threads. > > The idea is, > > > > Our write_cache_pages() during commit / > > ext4_journalled_submit_inode_data_buffers() > > depends on PAGECACHE_TAG_DIRTY being set, for pagevec_lookup_range_tag() > > to put such pages in the array to be write-protected with > > clear_page_dirty_for_io(). > > > > With a debug patch to dump_stack() callers of > > __test_set_page_writeback(), that can > > xas_clear_mark() it, _while_ the page is still going in our call to > > write_cache_pages(), > > we see this: wb_workfn() -> ext4_writepage() -> ext4_ext4_bio_write_page(), > > I guess there was something between wb_workfn() and ext4_writepage(), > wasn't there? There should be write_cache_pages()... > Yes, you're right, I oversimplified the function path between the first two. :) (I'll use '...' in the future to make that clearer.) > > i.e., _not_ going through ext4_journalled_writepage(), which > > knows/waits on journal. > > The other leg of ext4_writepage() _doesn't_, and thus can clear the > > tag and prevent > > write-protect while the journal commit / our write_cache_pages() is running. > > So I don't think this is quite it. If there are two writebacks racing, > either of them will writeprotect the page which is enough for commit to be > safe (as the mapped write will then trigger ext4_page_mkwrite() which will > wait for the end of running commit in jbd2_journal_get_write_access()). Or > am I missing something? But there must be still some race as you can still > see occasional checksum changes... So I must be missing something. > Actually you're right, I missed that -- either writeback will writeprotect the page. There's still something else going on, but it's me not you who missed something. :) Thanks for pointing that out. > > Since the switch to either leg is PageChecked(), I guess this might be > > due to clearing it right away in ext4_journalled_writepage(), before > > write-protection. > > The write-protection happens in clear_page_dirty_for_io() call so that's > before our ->writepage() callback is even called. > Right, I meant write-protection by the other/racing writeback thread. But please nevermind, as this doesn't seem to be the corner case anymore. Thanks! > Honza > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR -- Mauricio Faria de Oliveira