Re: [PATCH 0/2] arm64: hugetlb: Fix page fault loop for sw-dirty/hw-clean contiguous PTEs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/12/2023 17:26, James Houghton wrote:
> It is currently possible for a userspace application to enter a page
> fault loop when using HugeTLB pages implemented with contiguous PTEs
> when HAFDBS is not available. This happens because:
> 1. The kernel may sometimes write PTEs that are sw-dirty but hw-clean
>    (PTE_DIRTY | PTE_RDONLY | PTE_WRITE).

Hi James,

Do you know how this happens?

AFAIK, this is the set of valid bit combinations, and
PTE_RDONLY|PTE_WRITE|PTE_DIRTY is not one of them. Perhaps the real solution is
to understand how this is happening and prevent it?

/*
 * PTE bits configuration in the presence of hardware Dirty Bit Management
 * (PTE_WRITE == PTE_DBM):
 *
 * Dirty  Writable | PTE_RDONLY  PTE_WRITE  PTE_DIRTY (sw)
 *   0      0      |   1           0          0
 *   0      1      |   1           1          0
 *   1      0      |   1           0          1
 *   1      1      |   0           1          x
 *
 * When hardware DBM is not present, the sofware PTE_DIRTY bit is updated via
 * the page fault mechanism. Checking the dirty status of a pte becomes:
 *
 *   PTE_DIRTY || (PTE_WRITE && !PTE_RDONLY)
 */


Thanks,
Ryan


> 2. If, during a write, the CPU uses a sw-dirty, hw-clean PTE in handling
>    the memory access on a system without HAFDBS, we will get a page
>    fault.
> 3. HugeTLB will check if it needs to update the dirty bits on the PTE.
>    For contiguous PTEs, it will check to see if the pgprot bits need
>    updating. In this case, HugeTLB wants to write a sequence of
>    sw-dirty, hw-dirty PTEs, but it finds that all the PTEs it is about
>    to overwrite are all pte_dirty() (pte_sw_dirty() => pte_dirty()),
>    so it thinks no update is necessary.
> 
> Please see this[1] reproducer.
> 
> I think (though I may be wrong) that both step (1) and step (3) are
> buggy.
> 
> The first patch in this series fixes step (3); instead of checking if
> pte_dirty is matching in __cont_access_flags_changed, check pte_hw_dirty
> and pte_sw_dirty separately.
> 
> The second patch in this series makes step (1) less likely to occur.
> Without this patch, we can get the kernel to write a sw-dirty, hw-clean
> PTE with the following steps (showing the relevant VMA flags and pgprot
> bits):
> i.   Create a valid, writable contiguous PTE.
>        VMA vmflags:     VM_SHARED | VM_READ | VM_WRITE
>        VMA pgprot bits: PTE_RDONLY | PTE_WRITE
>        PTE pgprot bits: PTE_DIRTY | PTE_WRITE
> ii.  mprotect the VMA to PROT_NONE.
>        VMA vmflags:     VM_SHARED
>        VMA pgprot bits: PTE_RDONLY
>        PTE pgprot bits: PTE_DIRTY | PTE_RDONLY
> iii. mprotect the VMA back to PROT_READ | PROT_WRITE.
>        VMA vmflags:     VM_SHARED | VM_READ | VM_WRITE
>        VMA pgprot bits: PTE_RDONLY | PTE_WRITE
>        PTE pgprot bits: PTE_DIRTY | PTE_WRITE | PTE_RDONLY
> 
> Applying either one of the two patches in this patchset will fix the
> particular issue with HugeTLB pages implemented with contiguous PTEs.
> It's possible that only one of these patches should be taken, or that
> the right fix is something else entirely.
> 
> [1]: https://gist.github.com/48ca/11d1e466deee032cb35aa8c2280f93b0
> 
> James Houghton (2):
>   arm64: hugetlb: Distinguish between hw and sw dirtiness in
>     __cont_access_flags_changed
>   arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify
> 
>  arch/arm64/include/asm/pgtable.h | 6 ++++++
>  arch/arm64/mm/hugetlbpage.c      | 5 ++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> 
> base-commit: 645a9a454fdb7e698a63a275edca6a17ef97afc4





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux