Peter Xu <peterx@xxxxxxxxxx> writes:

> On Wed, Aug 10, 2022 at 08:53:49AM +0800, Huang, Ying wrote:
>> Peter Xu <peterx@xxxxxxxxxx> writes:
>>
>> > On Tue, Aug 09, 2022 at 04:40:12PM +0800, Huang, Ying wrote:
>> [snip]
>> >
>> >> I don't find pte_dirty() is synced to PageDirty() as in
>> >> try_to_migrate_one().  Is it a issue in the original code?
>> >
>> > I think it has?  There is:
>> >
>> > 		/* Set the dirty flag on the folio now the pte is gone. */
>> > 		if (pte_dirty(pteval))
>> > 			folio_mark_dirty(folio);
>> >
>>
>> Sorry, my original words are confusing.  Yes, there's dirty bit syncing
>> in try_to_migrate_one().  But I don't find that in migrate_device.c
>>
>> $ grep dirty mm/migrate_device.c
>> 				if (pte_soft_dirty(pte))
>> 					swp_pte = pte_swp_mksoft_dirty(swp_pte);
>> 				if (pte_swp_soft_dirty(pte))
>> 					swp_pte = pte_swp_mksoft_dirty(swp_pte);
>> 			entry = pte_mkwrite(pte_mkdirty(entry));
>>
>> I guess that migrate_device.c is used to migrate between CPU visible
>> page to CPU un-visible page (device visible), so the rule is different?
>
> IIUC migrate_vma_collect() handles migrations for both directions (RAM <->
> device mem).

That's correct.

> Yeah, indeed I also didn't see how migrate_vma_collect_pmd() handles the
> carry-over of pte dirty to page dirty, which looks a bit odd.  I also don't
> see why the dirty bit doesn't need to be maintained, e.g. when a previous
> page was dirty then after migration of ram->dev->ram it seems to be clean
> with current code.

That's a bug - it does need to be maintained. migrate_vma_*() currently
only works with anonymous private mappings. We could still lose data if
we attempt (but fail) to migrate a page that has been swapped in from
disk though, depending on the precise sequence.

Will post a fix for this, thanks for pointing it out.

> Another scenario is, even if the page was clean, as long as page migrated
> to device mem, device DMAed to the page, then page migrated back to RAM.  I
> also didn't see how we could detect the DMAs and set pte/page dirty
> properly after migrated back.

That would be up to the driver, unless we assume the page is always
dirty which is probably not a bad default. In practice I don't think
this will currently be a problem as any pages migrated to the device
won't have pages allocated in swap and this only works with private
anonymous mappings.

But I think we should fix it anyway so will include it in the fix.

> Copy Alistair and Jason..

Thanks. I will take a look at this series too, but probably won't get to
it until next week.

 - Alistair
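
For illustration only, the ram->dev->ram case discussed above can be
modelled in plain userspace C. This is a rough sketch and not the fix:
struct model_pte, struct model_page and all of the model_*() helpers are
hypothetical stand-ins rather than kernel types or functions. The only
kernel behaviour it imitates is the pte_dirty()/folio_mark_dirty() check
quoted from try_to_migrate_one(), and it assumes that page migration
itself carries the page-level dirty flag over to the destination page.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a PTE and a page; NOT the kernel types. */
struct model_pte  { bool present; bool dirty; };
struct model_page { bool dirty; };

/*
 * Tear down the mapping before migration. The dirty bit lives only in
 * the PTE, so unless it is copied to the page here it is lost once the
 * PTE is replaced (analogous to the pte_dirty()/folio_mark_dirty()
 * check in try_to_migrate_one()).
 */
static void model_unmap(struct model_pte *pte, struct model_page *page,
			bool sync_dirty)
{
	if (sync_dirty && pte->dirty)
		page->dirty = true;
	pte->present = false;
	pte->dirty = false;
}

/* Assumption: migration copies the page-level dirty flag to the new page. */
static void model_migrate_page(const struct model_page *src,
			       struct model_page *dst)
{
	dst->dirty = src->dirty;
}

/* Install a fresh, clean PTE for the destination page. */
static void model_map(struct model_pte *pte)
{
	pte->present = true;
	pte->dirty = false;
}

/*
 * Start with a dirty PTE over a page in RAM, migrate RAM -> device ->
 * RAM, and report whether the final RAM page is still marked dirty.
 */
bool model_roundtrip(bool sync_dirty)
{
	struct model_pte pte = { .present = true, .dirty = true };
	struct model_page ram = { .dirty = false };
	struct model_page dev = { .dirty = false };
	struct model_page ram_again = { .dirty = false };

	model_unmap(&pte, &ram, sync_dirty);	/* RAM -> device */
	model_migrate_page(&ram, &dev);
	model_map(&pte);

	model_unmap(&pte, &dev, sync_dirty);	/* device -> RAM */
	model_migrate_page(&dev, &ram_again);
	model_map(&pte);

	return ram_again.dirty;
}
```

In this model, model_roundtrip(false) returns false: the page ends up
looking clean even though it started out dirty, which is the behaviour
described above. model_roundtrip(true) keeps it dirty. The model does not
cover the device DMA case, where nothing in the PTE records the write.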