On 30/04/2024 14:55, Will Deacon wrote:
> On Tue, Apr 30, 2024 at 02:31:38PM +0100, Ryan Roberts wrote:
>> __split_huge_pmd_locked() can be called for a present THP, devmap or
>> (non-present) migration entry. It calls pmdp_invalidate()
>> unconditionally on the pmdp and only determines if it is present or not
>> based on the returned old pmd.
>>
>> But arm64's pmd_mkinvalid(), called by pmdp_invalidate(),
>> unconditionally sets the PMD_PRESENT_INVALID flag, which causes future
>> pmd_present() calls to return true - even for a swap pmd. Therefore any
>> lockless pgtable walker could see the migration entry pmd in this state
>> and start interpreting the fields (e.g. pmd_pfn()) as if it were
>> present, leading to BadThings (TM). GUP-fast appears to be one such
>> lockless pgtable walker.
>>
>> While the obvious fix is for core-mm to avoid such calls for non-present
>> pmds (pmdp_invalidate() also issues a TLBI, which is not necessary in
>> this case either), all other arches that implement pmd_mkinvalid() do it
>> in such a way that it is robust to being called with a non-present pmd.
>> So it is simpler and safer to make arm64 robust too. This approach means
>> we can even add tests to debug_vm_pgtable.c to validate the required
>> behaviour.
>>
>> This is a theoretical bug found during code review. I don't have any
>> test case to trigger it in practice.
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Fixes: 53fa117bb33c ("arm64/mm: Enable THP migration")
>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>> ---
>>
>> Hi all,
>>
>> v1 of this fix [1] took the approach of fixing core-mm to never call
>> pmdp_invalidate() on a non-present pmd. But Zi Yan highlighted that only arm64
>> suffers this problem; all other arches are robust. So his suggestion was to
>> instead make arm64 robust in the same way and add tests to validate it. Despite
>> my stated reservations in the v1 discussion, having thought about it for a bit,
>> I now agree with Zi Yan. Hence this post.
>>
>> Andrew has v1 in mm-unstable at the moment, so probably the best thing to do is
>> remove it from there and have this go in through the arm64 tree? Assuming there
>> is agreement that this approach is the right one.
>>
>> This applies on top of v6.9-rc5. Passes all the mm selftests on arm64.
>>
>> [1] https://lore.kernel.org/linux-mm/20240425170704.3379492-1-ryan.roberts@xxxxxxx/
>>
>> Thanks,
>> Ryan
>>
>>
>>  arch/arm64/include/asm/pgtable.h | 12 +++++--
>>  mm/debug_vm_pgtable.c            | 61 ++++++++++++++++++++++++++++++++
>>  2 files changed, 71 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index afdd56d26ad7..7d580271a46d 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -511,8 +511,16 @@ static inline int pmd_trans_huge(pmd_t pmd)
>>
>>  static inline pmd_t pmd_mkinvalid(pmd_t pmd)
>>  {
>> -        pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
>> -        pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
>> +        /*
>> +         * If not valid then either we are already present-invalid or we are
>> +         * not-present (i.e. none or swap entry). We must not convert
>> +         * not-present to present-invalid. Unbelievably, the core-mm may call
>> +         * pmd_mkinvalid() for a swap entry and all other arches can handle it.
>> +         */
>> +        if (pmd_valid(pmd)) {
>> +                pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
>> +                pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
>> +        }
>>
>>          return pmd;
>>  }
>
> Acked-by: Will Deacon <will@xxxxxxxxxx>

Thanks

> But it might be worth splitting the tests from the fix to make backporting
> easier.

Yes good point. I'll leave this hanging for today to see if any more comments
come in, and will re-post tomorrow as 2 patches. I assume we need to go fast to
catch 6.9.

> Catalin -- I assume you'll pick this up, but please shout if you want me
> to take it instead.
>
> Will
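P.S. For anyone wondering what the debug_vm_pgtable.c side of this boils down
to, the idea is roughly the below. This is a sketch only, not the actual hunk
from the patch: the function name, the use of args->fixed_pmd_pfn and the exact
wiring into the test harness are assumptions for illustration. It would sit
under the CONFIG_TRANSPARENT_HUGEPAGE guards that debug_vm_pgtable.c already
uses for its pmd tests.

    /* Sketch: check pmd_mkinvalid() leaves a non-present pmd non-present. */
    static void __init pmd_swap_mkinvalid_tests(struct pgtable_debug_args *args)
    {
            swp_entry_t entry;
            pmd_t pmd;

            if (!IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION) ||
                !has_transparent_hugepage())
                    return;

            /* Build a non-present (migration) pmd from an arbitrary pfn. */
            entry = make_writable_migration_entry(args->fixed_pmd_pfn);
            pmd = swp_entry_to_pmd(entry);
            WARN_ON(pmd_present(pmd));

            /* pmd_mkinvalid() must not turn a swap pmd into a present one. */
            pmd = pmd_mkinvalid(pmd);
            WARN_ON(pmd_present(pmd));
    }

With the old arm64 pmd_mkinvalid() (unconditionally setting
PMD_PRESENT_INVALID), the second WARN_ON would fire; with the pmd_valid() check
above it stays quiet, which is the behaviour the added tests are meant to pin
down.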