As soon as storage keys are enabled we need to work around of zero page mappings to prevent inconsistencies between storage keys and pgste. Otherwise following data corruption could happen: 1) guest enables storage key 2) guest sets storage key for not mapped page X -> change goes to PGSTE 3) guest reads from page X -> as X was not dirty before, the page will be zero page backed, storage key from PGSTE for X will go to storage key for zero page 4) guest sets storage key for not mapped page Y (same logic as above 5) guest reads from page Y -> as Y was not dirty before, the page will be zero page backed, storage key from PGSTE for Y will got to storage key for zero page overwriting storage key for X While holding the mmap sem, we are safe before changes on entries we already fixed. As sske and host large pages are also mutual exclusive we do not even need to retry the fixup_user_fault. Signed-off-by: Dominik Dingel <dingel@xxxxxxxxxxxxxxxxxx> Acked-by: Christian Borntraeger <borntraeger@xxxxxxxxxx> Signed-off-by: Martin Schwidefsky <schwidefsky@xxxxxxxxxx> --- arch/s390/Kconfig | 3 +++ arch/s390/mm/pgtable.c | 15 +++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 05c78bb..4e04e63 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -1,6 +1,9 @@ config MMU def_bool y +config NOZEROPAGE + def_bool y + config ZONE_DMA def_bool y diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index ab55ba8..6321692 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -1309,6 +1309,15 @@ static int __s390_enable_skey(pte_t *pte, unsigned long addr, pgste_t pgste; pgste = pgste_get_lock(pte); + /* + * Remove all zero page mappings, + * after establishing a policy to forbid zero page mappings + * following faults for that page will get fresh anonymous pages + */ + if (is_zero_pfn(pte_pfn(*pte))) { + ptep_flush_direct(walk->mm, addr, pte); + pte_val(*pte) = _PAGE_INVALID; + } /* Clear storage key */ pgste_val(pgste) &= ~(PGSTE_ACC_BITS | PGSTE_FP_BIT | PGSTE_GR_BIT | PGSTE_GC_BIT); @@ -1323,10 +1332,16 @@ void s390_enable_skey(void) { struct mm_walk walk = { .pte_entry = __s390_enable_skey }; struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; down_write(&mm->mmap_sem); if (mm_use_skey(mm)) goto out_up; + + for (vma = mm->mmap; vma; vma = vma->vm_next) + vma->vm_flags |= VM_NOZEROPAGE; + mm->def_flags |= VM_NOZEROPAGE; + walk.mm = mm; walk_page_range(0, TASK_SIZE, &walk); mm->context.use_skey = 1; -- 1.8.5.5 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html