On Wed, Feb 15, 2023 at 03:03:09PM +0500, Muhammad Usama Anjum wrote: > On 2/15/23 1:59 AM, Peter Xu wrote: > [..] > >>>> static inline bool is_pte_written(pte_t pte) > >>>> { > >>>> if ((pte_present(pte) && pte_uffd_wp(pte)) || > >>>> (pte_swp_uffd_wp_any(pte))) > >>>> return false; > >>>> return (pte_present(pte) || is_swap_pte(pte)); > >>>> } > >>> > >>> Could you explain why you don't want to return dirty for !present? A page > >>> can be written then swapped out. Don't you want to know that happened > >>> (from dirty tracking POV)? > >>> > >>> The code looks weird to me too.. We only have three types of ptes: (1) > >>> present, (2) swap, (3) none. > >>> > >>> Then, "(pte_present() || is_swap_pte())" is the same as !pte_none(). Is > >>> that what you're really looking for? > >> Yes, this is what I've been trying to do. I'll use !pte_none() to make it > >> simpler. > > > > Ah I think I see what you wanted to do now.. But I'm afraid it won't work > > for all cases. > > > > So IIUC the problem is anon pte can be empty, but since uffd-wp bit doesn't > > persist on anon (but none) ptes, then we got it lost and we cannot identify > > it from pages being written. Your solution will solve problem for > > anonymous, but I think it'll break file memories. > > > > Example: > > > > Consider one shmem page that got mapped, write protected (using UFFDIO_WP > > ioctl), written again (removing uffd-wp bit automatically), then zapped. > > The pte will be pte_none() but it's actually written, afaiu. > > > > Maybe it's time we should introduce UFFD_FEATURE_WP_ZEROPAGE, so we'll need > > to install pte markers for anonymous too (then it will work similarly like > > shmem/hugetlbfs, that we'll report writting to zero pages), then you'll > > need to have the new UFFD_FEATURE_WP_ASYNC depend on it. With that I think > > you can keep using the old check and it should start to work. > > > > Please let me know if my understanding is correct above. > Thank you for identifying it. Your understanding seems on point. I'll have > research things up about PTE Markers. I'm looking at your patches about it > [1]. Can you refer me to "mm alignment sessions" discussion in form of > presentation or if any transcript is available? No worry now, after a second thought I think zero page is better than pte markers, and I've got a patch that works for it here by injecting zero pages for anonymous: https://lore.kernel.org/all/20230215210257.224243-1-peterx@xxxxxxxxxx/ I think we'd also better to enforce your new WP_ASYNC feature bit to depend on this one, so fail the UFFDIO_API if WP_ASYNC && !WP_ZEROPAGE. Could you please try by rebasing your work upon this one? Hope it'll work for you already. Note again that you'll need to go back to the old is_pte|pmd_written() to make things work always, I think. [...] > I truly understand how you feel about export_prev_to_out(). It is really > difficult to understand. Even I had to made a hard try to come up with the > current code to avoid consuming a lot of kernel's memory while giving user > the compact output. I can surely map both of these with a dirty looking > macro. But I'm unable to find a decent macro to replace these. I think I'll > put a comment some where to explain whats going-on. So maybe I still missed something? I'll read the new version when it comes. Thanks, -- Peter Xu