On Thu, 6 Mar 2025 23:07:04 +0100 David Hildenbrand <david@xxxxxxxxxx> wrote: > On 06.03.25 11:23, David Hildenbrand wrote: > >> /** > >> - * make_folio_secure() - make a folio secure > >> + * __make_folio_secure() - make a folio secure > >> * @folio: the folio to make secure > >> * @uvcb: the uvcb that describes the UVC to be used > >> * > >> @@ -243,14 +276,13 @@ static int expected_folio_refs(struct folio *folio) > >> * -EINVAL if the UVC failed for other reasons. > >> * > >> * Context: The caller must hold exactly one extra reference on the folio > >> - * (it's the same logic as split_folio()) > >> + * (it's the same logic as split_folio()), and the folio must be > >> + * locked. > >> */ > >> -int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb) > >> +static int __make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb) > > > > One more nit: -EBUSY can no longer be returned from his function, so you > > might just remove it from the doc above. > > > > > > While chasing a very weird folio split bug that seems to result in late > > validation issues (:/), I was wondering if __gmap_destroy_page could > > similarly be problematic. > > > > We're now no longer holding the PTL while performing the operation. > > > > (not that that would explain the issue I am chasing, because > > gmap_destroy_page() is never called in my setup) > > > > Okay, I've been debugging for way to long the weird issue I am seeing, and I > did not find the root cause yet. But the following things are problematic: > > 1) To walk the page tables, we need the mmap lock in read mode. > > 2) To walk the page tables, we must know that a VMA exists > > 3) get_locked_pte() must not be used on hugetlb areas. > > Further, the following things should be cleaned up > > 4) s390_wiggle_split_folio() is only used in that file > > 5) gmap_make_secure() likely should be returning -EFAULT > > > See below, I went with a folio_walk (which also checks for pte_present() > like the old code did, but that should not matter here) so we can get rid of the > get_locked_pte() usage completely. I shall merge this into my patch, thanks a lot! > > > From 1b9a4306b79a352daf80708252d166114e7335de Mon Sep 17 00:00:00 2001 > From: David Hildenbrand <david@xxxxxxxxxx> > Date: Thu, 6 Mar 2025 22:43:43 +0100 > Subject: [PATCH] merge > > Signed-off-by: David Hildenbrand <david@xxxxxxxxxx> > --- > arch/s390/include/asm/uv.h | 1 - > arch/s390/kernel/uv.c | 41 ++++++++++++++++++-------------------- > arch/s390/kvm/gmap.c | 2 +- > 3 files changed, 20 insertions(+), 24 deletions(-) > > diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h > index fa33a6ff2fabf..46fb0ef6f9847 100644 > --- a/arch/s390/include/asm/uv.h > +++ b/arch/s390/include/asm/uv.h > @@ -634,7 +634,6 @@ int uv_convert_from_secure_pte(pte_t pte); > int make_hva_secure(struct mm_struct *mm, unsigned long hva, struct uv_cb_header *uvcb); > int uv_convert_from_secure(unsigned long paddr); > int uv_convert_from_secure_folio(struct folio *folio); > -int s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool split); > > void setup_uv(void); > > diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c > index 63420a5f3ee57..11a1894e63405 100644 > --- a/arch/s390/kernel/uv.c > +++ b/arch/s390/kernel/uv.c > @@ -270,7 +270,6 @@ static int expected_folio_refs(struct folio *folio) > * > * Return: 0 on success; > * -EBUSY if the folio is in writeback or has too many references; > - * -E2BIG if the folio is large; > * -EAGAIN if the UVC needs to be attempted again; > * -ENXIO if the address is not mapped; > * -EINVAL if the UVC failed for other reasons. > @@ -324,17 +323,6 @@ static int make_folio_secure(struct mm_struct *mm, struct folio *folio, struct u > return rc; > } > > -static pte_t *get_locked_valid_pte(struct mm_struct *mm, unsigned long hva, spinlock_t **ptl) > -{ > - pte_t *ptep = get_locked_pte(mm, hva, ptl); > - > - if (ptep && (pte_val(*ptep) & _PAGE_INVALID)) { > - pte_unmap_unlock(ptep, *ptl); > - ptep = NULL; > - } > - return ptep; > -} > - > /** > * s390_wiggle_split_folio() - try to drain extra references to a folio and optionally split > * @mm: the mm containing the folio to work on > @@ -344,7 +332,7 @@ static pte_t *get_locked_valid_pte(struct mm_struct *mm, unsigned long hva, spin > * Context: Must be called while holding an extra reference to the folio; > * the mm lock should not be held. > */ > -int s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool split) > +static int s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool split) > { > int rc; > > @@ -361,20 +349,28 @@ int s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool spli > } > return -EAGAIN; > } > -EXPORT_SYMBOL_GPL(s390_wiggle_split_folio); > > int make_hva_secure(struct mm_struct *mm, unsigned long hva, struct uv_cb_header *uvcb) > { > + struct vm_area_struct *vma; > + struct folio_walk fw; > struct folio *folio; > - spinlock_t *ptelock; > - pte_t *ptep; > int rc; > > - ptep = get_locked_valid_pte(mm, hva, &ptelock); > - if (!ptep) > + mmap_read_lock(mm); > + > + vma = vma_lookup(mm, hva); > + if (!vma) { > + mmap_read_unlock(mm); > + return -EFAULT; > + } > + > + folio = folio_walk_start(&fw, vma, hva, 0); > + if (!folio) { > + mmap_read_unlock(mm); > return -ENXIO; > + } > > - folio = page_folio(pte_page(*ptep)); > folio_get(folio); > /* > * Secure pages cannot be huge and userspace should not combine both. > @@ -385,14 +381,15 @@ int make_hva_secure(struct mm_struct *mm, unsigned long hva, struct uv_cb_header > * KVM_RUN will return -EFAULT. > */ > if (folio_test_hugetlb(folio)) > - rc = -EFAULT; > + rc = -EFAULT; > else if (folio_test_large(folio)) > rc = -E2BIG; > - else if (!pte_write(*ptep)) > + else if (!pte_write(fw.pte) || (pte_val(fw.pte) & _PAGE_INVALID)) > rc = -ENXIO; > else > rc = make_folio_secure(mm, folio, uvcb); > - pte_unmap_unlock(ptep, ptelock); > + folio_walk_end(&fw, vma); > + mmap_read_unlock(mm); > > if (rc == -E2BIG || rc == -EBUSY) > rc = s390_wiggle_split_folio(mm, folio, rc == -E2BIG); > diff --git a/arch/s390/kvm/gmap.c b/arch/s390/kvm/gmap.c > index 21580cfecc6ac..1a88b32e7c134 100644 > --- a/arch/s390/kvm/gmap.c > +++ b/arch/s390/kvm/gmap.c > @@ -41,7 +41,7 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb) > > vmaddr = gfn_to_hva(kvm, gpa_to_gfn(gaddr)); > if (kvm_is_error_hva(vmaddr)) > - rc = -ENXIO; > + rc = -EFAULT; > else > rc = make_hva_secure(gmap->mm, vmaddr, uvcb); >