On 11/30/20 3:08 PM, Catalin Marinas wrote:
> On Mon, Nov 30, 2020 at 09:55:00AM +0530, Anshuman Khandual wrote:
>> On 11/27/20 3:14 PM, Catalin Marinas wrote:
>>> On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
>>>> On 27/11/2020 at 06:06, Anshuman Khandual wrote:
>>>>> This adds validation tests for dirtiness after write protect conversion for
>>>>> each page table level. This is important for platforms such as arm64 that
>>>>> removes the hardware dirty bit while making it an write protected one. This
>>>>> also fixes pxx_wrprotect() related typos in the documentation file.
>>>>
>>>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>>>> index c05d9dcf7891..a5be11210597 100644
>>>>> --- a/mm/debug_vm_pgtable.c
>>>>> +++ b/mm/debug_vm_pgtable.c
>>>>> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
>>>>>      WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
>>>>>      WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
>>>>>      WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
>>>>> +    WARN_ON(pte_dirty(pte_wrprotect(pte)));
>>>>
>>>> Wondering what you are testing here exactly.
>>>>
>>>> Do you expect that if PTE has the dirty bit, it gets cleared by
>>>> pte_wrprotect() ?
>>>>
>>>> Powerpc doesn't do that, it only clears the RW bit but the dirty
>>>> bit remains if it is set, until you call pte_mkclean() explicitely.
>>>
>>> Arm64 has an unusual way of setting a hardware dirty "bit", it actually
>>> clears the PTE_RDONLY bit. The pte_wrprotect() sets the PTE_RDONLY bit
>>> back and we can lose the dirty information. Will found this and posted
>>> patches to fix the arm64 pte_wprotect() to set a software PTE_DIRTY if
>>> !PTE_RDONLY (we do this for ptep_set_wrprotect() already). My concern
>>> was that we may inadvertently make a fresh/clean pte dirty with such
>>> change, hence the suggestion for the test.
>>>
>>> That said, I think we also need a test in the other direction,
>>> pte_wrprotect() should preserve any dirty information:
>>>
>>>     WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
>>
>> This seems like a generic enough principle which all platforms should
>> adhere to. But the proposed test WARN_ON(pte_dirty(pte_wrprotect(pte)))
>> might fail on some platforms if the page table entry came in as a dirty
>> one and pte_wrprotect() is not expected to alter the dirty state.
>
> Ah, so do we have architectures where entries in protection_map[] are
> already dirty? If those are valid, maybe the check should be:

Okay, I did not imply that actually. The new tests currently sit right at
the end of the respective pxx_basic_tests() functions, so the pxx entry
might already have gone through some changes since it was originally
created with pfn_pxx(). The entry at that point is no longer in its
initial state, and by design it is not expected to be. So the dirty bit
might or might not be set, depending on all the previous test sequences
leading up to these new ones.

IIUC, Christophe mentioned that on platforms like powerpc, the dirty bit
just remains unchanged during pte_wrprotect(). So the current test
WARN_ON(pte_dirty(pte_wrprotect(pte))) will not work on powerpc if the
previous tests leading up to that point have set the dirty bit. This is
irrespective of how the entry was originally created with pfn_pte() from
protection_map[] at the beginning.

>
>     WARN_ON(!pte_dirty(pte) && pte_dirty(pte_wrprotect(pte)));
>
>> Instead, should we just add the following two tests, which would ensure
>> that pte_wrprotect() never alters the dirty state of a page table entry.
>>
>>     WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
>>     WARN_ON(pte_dirty(pte_wrprotect(pte_mkclean(pte))));
>
> These should be added as additional tests. However, my initial thought

Okay, will add them.

> was to check whether pte_wrprotect() on a new pte created from a
> protection_map[] entry directly would inadvertently dirty it. On arm64,
> that means a protection_map[] entry missing PTE_RDONLY. A pte_mkclean()
> would set PTE_RDONLY, so we'd miss such check.
>

To achieve this, we could move the test to the very beginning, right after
the pxx gets created from protection_map[], with a comment explaining the
rationale.
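
To illustrate why the placement matters, here is a rough userspace sketch.
It is not kernel code. The bit positions, the helper bodies and the two
check functions at the bottom are assumptions made up for illustration,
loosely modelled on the arm64 behaviour described above (hardware dirty
means PTE_RDONLY is clear on a writable entry, and the fixed pte_wrprotect()
sets software PTE_DIRTY if !PTE_RDONLY).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model only: bit positions below are made up. */
typedef struct { uint64_t val; } pte_t;

#define PTE_WRITE  (UINT64_C(1) << 0)	/* entry is writable */
#define PTE_RDONLY (UINT64_C(1) << 1)	/* hardware read-only bit */
#define PTE_DIRTY  (UINT64_C(1) << 2)	/* software dirty bit */

static bool pte_write(pte_t pte)
{
	return (pte.val & PTE_WRITE) != 0;
}

/* Dirty if software dirty, or writable with PTE_RDONLY cleared. */
static bool pte_dirty(pte_t pte)
{
	bool sw_dirty = (pte.val & PTE_DIRTY) != 0;
	bool hw_dirty = pte_write(pte) && !(pte.val & PTE_RDONLY);

	return sw_dirty || hw_dirty;
}

static pte_t pte_mkdirty(pte_t pte)
{
	pte.val |= PTE_DIRTY;
	if (pte_write(pte))
		pte.val &= ~PTE_RDONLY;
	return pte;
}

/* Note that cleaning also sets PTE_RDONLY in this model. */
static pte_t pte_mkclean(pte_t pte)
{
	pte.val &= ~PTE_DIRTY;
	pte.val |= PTE_RDONLY;
	return pte;
}

/* Assumed "fixed" variant: carry !PTE_RDONLY over into PTE_DIRTY. */
static pte_t pte_wrprotect(pte_t pte)
{
	if (!(pte.val & PTE_RDONLY))
		pte.val |= PTE_DIRTY;
	pte.val &= ~PTE_WRITE;
	pte.val |= PTE_RDONLY;
	return pte;
}

/* Hypothetical check on a fresh entry: 1 if a clean pte turns dirty. */
static int wrprotect_dirties_clean_pte(pte_t pte)
{
	return !pte_dirty(pte) && pte_dirty(pte_wrprotect(pte));
}

/* Hypothetical helper: counts failures of the two preservation tests. */
static int dirty_preservation_failures(pte_t pte)
{
	int failures = 0;

	if (!pte_dirty(pte_wrprotect(pte_mkdirty(pte))))
		failures++;
	if (pte_dirty(pte_wrprotect(pte_mkclean(pte))))
		failures++;
	return failures;
}
```

In this model, an entry that starts out without PTE_RDONLY trips the
check on the fresh entry, while the two mkdirty/mkclean based tests still
pass for it because pte_mkclean() sets PTE_RDONLY first. That is the case
the test at the beginning is meant to catch.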