Re: [PATCH resend RFC 0/9] s390: fixes, cleanups and optimizations for page table walkers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 14.09.21 18:50, Claudio Imbrenda wrote:
On Thu,  9 Sep 2021 18:22:39 +0200
David Hildenbrand <david@xxxxxxxxxx> wrote:

Resend because I missed ccing people on the actual patches ...

RFC because the patches are essentially untested and I did not actually
try to trigger any of the things these patches are supposed to fix. It

this is an interesting series, and the code makes sense, but I would
really like to see some regression tests, and maybe even some
selftests to trigger (at least some of) the issues.

Yep, it most certainly needs regression testing before picking any of this. selftests would be great, but I won't find time for it in the foreseeable future.


the follow-up question is: how did we manage to go on so long without
noticing these issues? :D

Excellent question - I guess we simply weren't aware of the dos and don'ts when dealing with process page tables :)


merely matches my current understanding (and what other code does :) ). I
did compile-test as far as possible.

After learning more about the wonderful world of page tables and their
interaction with the mmap_sem and VMAs, I spotted some issues in our
page table walkers that allow user space to trigger nasty behavior when
playing dirty tricks with munmap() or mmap() of hugetlb. While some issues
should be hard to trigger, others are fairly easy because we provide
conventient interfaces (e.g., KVM_S390_GET_SKEYS and KVM_S390_SET_SKEYS).

Future work:
- Don't use get_locked_pte() when it's not required to actually allocate
   page tables -- similar to how storage keys are now handled. Examples are
   get_pgste() and __gmap_zap.
- Don't use get_locked_pte() and instead let page fault logic allocate page
   tables when we actually do need page tables -- also, similar to how
   storage keys are now handled. Examples are set_pgste_bits() and
   pgste_perform_essa().
- Maybe switch to mm/pagewalk.c to avoid custom page table walkers. For
   __gmap_zap() that's very easy.

Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
Cc: Cornelia Huck <cohuck@xxxxxxxxxx>
Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
Cc: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
Cc: Ulrich Weigand <Ulrich.Weigand@xxxxxxxxxx>

David Hildenbrand (9):
   s390/gmap: validate VMA in __gmap_zap()
   s390/gmap: don't unconditionally call pte_unmap_unlock() in
     __gmap_zap()
   s390/mm: validate VMA in PGSTE manipulation functions
   s390/mm: fix VMA and page table handling code in storage key handling
     functions
   s390/uv: fully validate the VMA before calling follow_page()
   s390/pci_mmio: fully validate the VMA before calling follow_pte()
   s390/mm: no need for pte_alloc_map_lock() if we know the pmd is
     present
   s390/mm: optimize set_guest_storage_key()
   s390/mm: optimize reset_guest_reference_bit()

  arch/s390/kernel/uv.c    |   2 +-
  arch/s390/mm/gmap.c      |  11 +++-
  arch/s390/mm/pgtable.c   | 109 +++++++++++++++++++++++++++------------
  arch/s390/pci/pci_mmio.c |   4 +-
  4 files changed, 89 insertions(+), 37 deletions(-)


base-commit: 7d2a07b769330c34b4deabeed939325c77a7ec2f



--
Thanks,

David / dhildenb




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux