Re: [PATCH] mm/mremap: Fix move_normal_pmd/retract_page_tables race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07.10.24 23:42, Jann Horn wrote:
In mremap(), move_page_tables() looks at the type of the PMD entry and the
specified address range to figure out by which method the next chunk of
page table entries should be moved.
At that point, the mmap_lock is held in write mode, but no rmap locks are
held yet. For PMD entries that point to page tables and are fully covered
by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called,
which first takes rmap locks, then does move_normal_pmd().
move_normal_pmd() takes the necessary page table locks at source and
destination, then moves an entire page table from the source to the
destination.

The problem is: The rmap locks, which protect against concurrent page table
removal by retract_page_tables() in the THP code, are only taken after the
PMD entry has been read and it has been decided how to move it.
So we can race as follows (with two processes that have mappings of the
same tmpfs file that is stored on a tmpfs mount with huge=advise); note
that process A accesses page tables through the MM while process B does it
through the file rmap:


process A                      process B
=========                      =========
mremap
   mremap_to
     move_vma
       move_page_tables
         get_old_pmd
         alloc_new_pmd
                       *** PREEMPT ***
                                madvise(MADV_COLLAPSE)
                                  do_madvise
                                    madvise_walk_vmas
                                      madvise_vma_behavior
                                        madvise_collapse
                                          hpage_collapse_scan_file
                                            collapse_file
                                              retract_page_tables
                                                i_mmap_lock_read(mapping)
                                                pmdp_collapse_flush
                                                i_mmap_unlock_read(mapping)
         move_pgt_entry(NORMAL_PMD, ...)
           take_rmap_locks
           move_normal_pmd
           drop_rmap_locks

When this happens, move_normal_pmd() can end up creating bogus PMD entries
in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`.
The effect depends on arch-specific and machine-specific details; on x86,
you can end up with physical page 0 mapped as a page table, which is likely
exploitable for user->kernel privilege escalation.


Fix the race by letting process B recheck that the PMD still points to a
page table after the rmap locks have been taken. Otherwise, we bail and let
the caller fall back to the PTE-level copying path, which will then bail
immediately at the pmd_none() check.

Bug reachability: Reaching this bug requires that you can create shmem/file
THP mappings - anonymous THP uses different code that doesn't zap stuff
under rmap locks. File THP is gated on an experimental config flag
(CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need shmem
THP to hit this bug. As far as I know, getting shmem THP normally requires
that you can mount your own tmpfs with the right mount flags, which would
require creating your own user+mount namespace; though I don't know if some
distros maybe enable shmem THP by default or something like that.

Bug impact: This issue can likely be used for user->kernel privilege
escalation when it is reachable.

Cc: stable@xxxxxxxxxxxxxxx
Fixes: 1d65b771bc08 ("mm/khugepaged: retract_page_tables() without mmap or vma lock")
Closes: https://project-zero.issues.chromium.org/371047675
Co-developed-by: David Hildenbrand <david@xxxxxxxxxx>
Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
---
@David: please confirm we can add your Signed-off-by to this patch after
the Co-developed-by.

Sure, thanks for sending this out!

Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux