On Mon, Nov 30, 2020 at 04:28:20PM +0530, Anshuman Khandual wrote:
> On 11/30/20 3:08 PM, Catalin Marinas wrote:
> > On Mon, Nov 30, 2020 at 09:55:00AM +0530, Anshuman Khandual wrote:
> >> On 11/27/20 3:14 PM, Catalin Marinas wrote:
> >>> On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
> >>>> On 27/11/2020 at 06:06, Anshuman Khandual wrote:
> >>>>> This adds validation tests for dirtiness after write protect
> >>>>> conversion for each page table level. This is important for
> >>>>> platforms such as arm64 that remove the hardware dirty bit while
> >>>>> making an entry write-protected. This also fixes pxx_wrprotect()
> >>>>> related typos in the documentation file.
> >>>>
> >>>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> >>>>> index c05d9dcf7891..a5be11210597 100644
> >>>>> --- a/mm/debug_vm_pgtable.c
> >>>>> +++ b/mm/debug_vm_pgtable.c
> >>>>> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> >>>>>      WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
> >>>>>      WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
> >>>>>      WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
> >>>>> +    WARN_ON(pte_dirty(pte_wrprotect(pte)));
> >>>>
> >>>> Wondering what you are testing here exactly.
> >>>>
> >>>> Do you expect that if PTE has the dirty bit, it gets cleared by
> >>>> pte_wrprotect() ?
> >>>>
> >>>> Powerpc doesn't do that, it only clears the RW bit but the dirty
> >>>> bit remains if it is set, until you call pte_mkclean() explicitly.
> >>>
> >>> Arm64 has an unusual way of setting a hardware dirty "bit": it
> >>> actually clears the PTE_RDONLY bit. The pte_wrprotect() sets the
> >>> PTE_RDONLY bit back and we can lose the dirty information. Will
> >>> found this and posted patches to fix the arm64 pte_wrprotect() to
> >>> set a software PTE_DIRTY if !PTE_RDONLY (we do this for
> >>> ptep_set_wrprotect() already). My concern was that we may
> >>> inadvertently make a fresh/clean pte dirty with such a change,
> >>> hence the suggestion for the test.
> >>>
> >>> That said, I think we also need a test in the other direction,
> >>> pte_wrprotect() should preserve any dirty information:
> >>>
> >>>     WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
> >>
> >> This seems like a generic enough principle which all platforms should
> >> adhere to. But the proposed test WARN_ON(pte_dirty(pte_wrprotect(pte)))
> >> might fail on some platforms if the page table entry came in as a dirty
> >> one and pte_wrprotect() is not expected to alter the dirty state.
> >
> > Ah, so do we have architectures where entries in protection_map[] are
> > already dirty? If those are valid, maybe the check should be:
>
> Okay, I did not imply that actually. The current position for these new
> tests in the respective pxx_basic_tests() functions is right at the end,
> and hence the pxx might have already gone through some changes from the
> time it was originally created with pfn_pxx(). The entry here is not in
> its initial state, and by design it is not expected to be. So the dirty
> bit might or might not be there depending on all the previous test
> sequences leading up to these new ones.
>
> IIUC, Christophe mentioned the fact that on platforms like powerpc, the
> dirty bit just remains unchanged during pte_wrprotect(). So the current
> test WARN_ON(pte_dirty(pte_wrprotect(pte))) will not work on powerpc if
> the previous tests leading up to that point have set the dirty bit. This
> is irrespective of how it was created with pfn_pte() from
> protection_map[] originally at the beginning.

[...]

> To achieve this, we could move the test right at the beginning just after
> the pxx gets created from protection_map[], with a comment explaining the
> rationale.

OK, this makes sense. Thanks for the clarification.

-- 
Catalin
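
For reference, the two invariants discussed in this thread can be sketched
as a small userspace C model. This is only an illustrative toy under stated
assumptions, not the kernel code: the pte_t layout, the bit positions and
the names pte_wrprotect_lossy() and pte_wrprotect_fixed() are hypothetical
and exist only for this sketch. It models "hardware dirty" the way the
thread describes arm64 (a writable entry with PTE_RDONLY cleared) and shows
why a write-protect that just sets PTE_RDONLY loses dirtiness, while one
that first records a software dirty bit preserves it without dirtying a
clean entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE; bit positions are made up for illustration (assumption). */
typedef struct { uint64_t val; } pte_t;

#define PTE_WRITE  (1ULL << 0) /* entry is writable */
#define PTE_RDONLY (1ULL << 1) /* read-only; cleared means "hardware dirty" */
#define PTE_DIRTY  (1ULL << 2) /* software dirty bit */

static bool pte_write(pte_t pte)    { return (pte.val & PTE_WRITE) != 0; }
static bool pte_sw_dirty(pte_t pte) { return (pte.val & PTE_DIRTY) != 0; }

/* Hardware dirty: writable and PTE_RDONLY cleared. */
static bool pte_hw_dirty(pte_t pte)
{
	return pte_write(pte) && !(pte.val & PTE_RDONLY);
}

static bool pte_dirty(pte_t pte)
{
	return pte_sw_dirty(pte) || pte_hw_dirty(pte);
}

/* Lossy variant: setting PTE_RDONLY discards the hardware dirty state. */
static pte_t pte_wrprotect_lossy(pte_t pte)
{
	pte.val &= ~PTE_WRITE;
	pte.val |= PTE_RDONLY;
	return pte;
}

/* Fixed variant: record hardware dirtiness in software first. */
static pte_t pte_wrprotect_fixed(pte_t pte)
{
	if (pte_hw_dirty(pte))
		pte.val |= PTE_DIRTY;
	return pte_wrprotect_lossy(pte);
}

/* The two checks from the thread, applied to known starting states. */
static void check_wrprotect_invariants(void)
{
	pte_t clean = { PTE_WRITE | PTE_RDONLY }; /* writable, not dirty */
	pte_t dirty = { PTE_WRITE };              /* hardware dirty */

	/* A clean entry must not become dirty through write-protect. */
	assert(!pte_dirty(pte_wrprotect_fixed(clean)));
	/* A dirty entry must stay dirty through write-protect. */
	assert(pte_dirty(pte_wrprotect_fixed(dirty)));
}
```

Both checks start from an entry whose dirty state is known, which is the
same reason given above for running the new test right after the entry is
created rather than at the end of pxx_basic_tests().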