On 09.02.2018 10:34, Janosch Frank wrote:
> Since the z10 s390 does support 1M pages, but whereas hugetlbfs
> support was added quite fast, KVM always used standard 4k pages for
> guest backings.
>
> This patchset adds full support for 1M huge page backings for s390
> KVM guests. I.e. we also support VSIE (nested vms) for these guests
> and are therefore able to run all combinations of backings for all
> layers of guests.
>
> When running a VSIE guest in a huge page backed guest, we need to
> split some huge pages to be able to set granular protection. This way
> we avoid a prot/unprot cycle if prefixes and VSIE pages containing
> level 3 gmap DAT tables share the same segment, as the prefix has to
> be accessible at all times and the VSIE page has to be write
> protected.
>
> Branch:
> git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git hlp_vsie
> https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git/log/?h=hlp_vsie
>

A general proposal: We will have split PMDs with fake PGSTE. This is
nasty but needed. I think we should hinder virtualization from making
use of these. Just like we already do for vSIE.

Should we make the KVM_CAP_S390_HPAGE a configuration option?

Without it being set, don't allow mapping huge pages into the GMAP.
Everything as usual.

With it being set (by user space when it thinks we need huge pages),
allow mapping huge pages into the GMAP AND

- Explicitly disable CMMA. Right now we trust user space to do the
  right thing. ecb2 &= ~ECB2_CMMA
- Disable PFMFI -> ecb2 &= ~ECB2_PFMFI
- Disable SKF by setting scb->ictl |= ICTL_ISKE | ICTL_SSKE | ICTL_RRBE

So user space has to explicitly indicate and allow huge pages.

This will result in all instructions that touch the PGSTE getting
intercepted, so we can properly work on the huge PMDs instead.
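To make the proposal concrete, here is a rough standalone sketch of what
I have in mind. It is not kernel code: struct sketch_scb, struct
sketch_vm and both helper functions are made up for illustration, and
the bit values are only assumed here so the snippet compiles on its own
(the real definitions live in the s390 KVM headers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit values, for illustration only. The authoritative
 * definitions are the ones in the kernel's s390 KVM headers.
 */
#define ECB2_CMMA  0x80
#define ECB2_PFMFI 0x08
#define ICTL_ISKE  0x00004000u
#define ICTL_SSKE  0x00002000u
#define ICTL_RRBE  0x00001000u

/* Hypothetical, cut-down stand-in for the SIE control block. */
struct sketch_scb {
	uint8_t  ecb2;
	uint32_t ictl;
};

/* Hypothetical per-VM state: did user space enable the capability? */
struct sketch_vm {
	bool hpage_enabled;
};

/* Huge pages may only be mapped into the GMAP after the explicit opt-in. */
static inline bool sketch_may_map_huge(const struct sketch_vm *vm)
{
	assert(vm);
	return vm->hpage_enabled;
}

/*
 * With the capability set, turn off everything that would let the
 * guest touch the PGSTE without an intercept. Without it, leave the
 * control block alone ("everything as usual").
 */
static inline void sketch_apply_hpage_policy(const struct sketch_vm *vm,
					     struct sketch_scb *scb)
{
	assert(vm && scb);
	if (!vm->hpage_enabled)
		return;

	scb->ecb2 &= (uint8_t)~ECB2_CMMA;	/* explicitly disable CMMA */
	scb->ecb2 &= (uint8_t)~ECB2_PFMFI;	/* disable PFMFI */
	/* disable SKF: intercept ISKE, SSKE and RRBE */
	scb->ictl |= ICTL_ISKE | ICTL_SSKE | ICTL_RRBE;
}
```

The point of keeping the check in one place is that the fault path only
has to ask sketch_may_map_huge() before installing a huge PMD, and the
vcpu setup only has to call sketch_apply_hpage_policy() once.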
--

Thanks,

David / dhildenb