Re: [PATCH v4 0/9] KVM/s390: Hugetlbfs enablement

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 27.06.2018 15:55, Janosch Frank wrote:
> So, after the userfaultfd fix postcopy does work now, but vSIE in
> combination with paging can still result in crashing g3s. Therefore we
> split up the series and only integrate non-vSIE support for now.
> 
> Branch:
> git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git hlp_vsie
> https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git/log/?h=hlp_vsie
> 

So, after some off-lists discussions my view on things

1. No dirty tracking on user space tables is necessary. All KVM internal
guest memory accesses have to manually set dirty bits (and we seem to do
very well already). All user space (QEMU) accesses have to do that on
their own (also usually already done but we might miss some corner cases).

2. Splitting of huge PMDs is not necessary without vSIE and only a
performance improvement for migration ("dirty 4k vs 1m").

3. PGSTE for split PMDs could be avoided just like for shadow GMAPs, we
only have to care about locking at the two notifier bits.

So in summary, I would really like to see a v1 without split of PMDs and
a v2 with lazy splitting of all PMDs. This might require some work, but
it should be fairly easy to get v1 implemented (in contrast to the
splitting magic). v3 can then take care of vSIE. If I am not completely
wrong, even THP could be handled fairly easy in a v1 already.

Especially v2 can handle it similar to x86:

a) dirty tracking: when enabling, mark all PMDs in the GMAP as R/O. All
new *read* GMAP faults can map PMDs. All new GMAP *write* faults will
only create split PMDs and do this lazily (replacing protected PMDs by
split PMDs). So after a write fault, only one new 4k page will be mapped
writable and marked as dirty. We can detect if dirty tracking is turned
on/off using the kvm slots. We might be able to simply make global
decisions to simplify the implementation "one slot is set to do dirty
tracking, do it for all".

b) mprotect/protect_range: As long as we have no restricted protection
(e.g. prefix only sets notifier bit) there is no need to split. You can
use a notifier bit on the PMD. Once we actually want to restrict access
(sub PMD protection for e.g. gmap shadow - we should forbid this
initially on huge PMDs), we have to split. This means, without vSIE we
never have to split except for a).

The nice thing about this approach (if I didn't miss anything) is that
the PMDs with a guest prefix on them are only split if dirty tracking is
active (or vSIE requires it via mprotect()) - in contrast to this
series. And in contrast to this series, *all* memory will be dirty
tracked using 4k instead of 1MB pages, which makes a lot of sense - even
1 CPU can otherwise set so many 1MB pages dirty easily during migration.

> 
> V4:
> 	* Split up vSIE patches
> 	* Added hpage module parameter disabling vSIE when set.
> 	* Added HPAGE capability that has to be enabled for huge guests
> 
> V3:
> 	* Moved splitting to the front.
> 	* Cleanups
> 
> V2:
> 	* Incorporated changes from David's cleanup
> 	* Now flushing with IDTE_NODAT for protection transfers.
> 	* Added RRBE huge page handling for g2 -> g3 skey emulation
> 	* Added documentation for capability
> 	* Renamed GMAP_ENTRY_* constants
> 	* Added SEGMENT hardware bits constants
> 	* Improved some patch descriptions
> 	* General small improvements
> 	* Introduced pte_from_pmd function
> 
> Dominik Dingel (2):
>   s390/mm: clear huge page storage keys on enable_skey
>   s390/mm: hugetlb pages within a gmap can not be freed
> 
> Janosch Frank (7):
>   s390/mm: Abstract gmap notify bit setting
>   s390/mm: Introduce gmap_pmdp_xchg
>   s390/mm: add gmap pmd invalidation notification
>   s390/mm: Add gmap pmd invalidation and clearing
>   s390/mm: Add huge page dirty sync support
>   s390/mm: Add huge pmd storage key handling
>   s390/mm: Enable gmap huge pmd support
> 
>  Documentation/virtual/kvm/api.txt   |  16 +
>  arch/s390/include/asm/gmap.h        |  23 ++
>  arch/s390/include/asm/mmu.h         |   2 +
>  arch/s390/include/asm/mmu_context.h |   1 +
>  arch/s390/include/asm/pgtable.h     |  17 +-
>  arch/s390/kvm/kvm-s390.c            |  57 +++-
>  arch/s390/mm/gmap.c                 | 612 ++++++++++++++++++++++++++++++++++--
>  arch/s390/mm/pageattr.c             |   6 +-
>  arch/s390/mm/pgtable.c              | 182 ++++++++++-
>  include/uapi/linux/kvm.h            |   1 +
>  10 files changed, 855 insertions(+), 62 deletions(-)
> 


-- 

Thanks,

David / dhildenb



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux