On Tue, Nov 26, 2019 at 07:59:49PM -0800, Hugh Dickins wrote: > On Mon, 25 Nov 2019, Bharata B Rao wrote: > > > On PEF-enabled POWER platforms that support running of secure guests, > > secure pages of the guest are represented by device private pages > > in the host. Such pages needn't participate in KSM merging. This is > > achieved by using ksm_madvise() call which need to be exported > > since KVM PPC can be a kernel module. > > > > Signed-off-by: Bharata B Rao <bharata@xxxxxxxxxxxxx> > > Acked-by: Paul Mackerras <paulus@xxxxxxxxxx> > > Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx> > > Cc: Hugh Dickins <hughd@xxxxxxxxxx> > > I can say > Acked-by: Hugh Dickins <hughd@xxxxxxxxxx> > to this one. > > But not to your 2/7 which actually makes use of it: because sadly it > needs down_write(&kvm->mm->mmap_sem) for the case when it switches off > VM_MERGEABLE in vma->vm_flags. That's frustrating, since I think it's > the only operation for which down_read() is not good enough. Oh ok! Thanks for pointing this out. > > I have no idea how contended that mmap_sem is likely to be, nor how > many to-be-secured pages that vma is likely to contain: you might find > it okay simply to go with it down_write throughout, or you might want > to start out with it down_read, and only restart with down_write (then > perhaps downgrade_write later) when you see VM_MERGEABLE is set. Using down_write throughtout is not easy as we do migrate_vma_pages() from fault path (->migrate_to_ram()) too. Here we come with down_read already held. Starting with down_read and restarting with down_write if VM_MERGEABLE is set -- this also looks a bit difficult as we will have challenges with locking order if we release mmap_sem in between and re-acquire. So I think I will start with down_write in this particular case and will downgrade_write as soon as ksm_madvise() is complete. > > The crash you got (thanks for the link): that will be because your > migrate_vma_pages() had already been applied to a page that was > already being shared via KSM. > > But if these secure pages are expected to be few and far between, > maybe you'd prefer to keep VM_MERGEABLE, and add per-page checks > of some kind into mm/ksm.c, to skip over these surprising hybrids. I did bail out from a few routines in mm/ksm.c with is_device_private_page(page) check, but that wasn't good enough and I encountered crashes in different code paths. Guess a bit more understanding of KSM internals would be required before retrying that. However since all the pages of the guest except for a few will be turned into secure pages early during boot, it appears better if secure guests don't participate in in KSM merging at all. Regards, Bharata.