RE: RFC: Split EPT huge pages in advance of dirty logging

> -----Original Message-----
> From: Peter Feiner [mailto:pfeiner@xxxxxxxxxx]
> Sent: Saturday, February 22, 2020 8:19 AM
> To: Junaid Shahid <junaids@xxxxxxxxxx>
> Cc: Ben Gardon <bgardon@xxxxxxxxxx>; Zhoujian (jay)
> <jianjay.zhou@xxxxxxxxxx>; Peter Xu <peterx@xxxxxxxxxx>;
> kvm@xxxxxxxxxxxxxxx; qemu-devel@xxxxxxxxxx; pbonzini@xxxxxxxxxx;
> dgilbert@xxxxxxxxxx; quintela@xxxxxxxxxx; Liujinsong (Paul)
> <liu.jinsong@xxxxxxxxxx>; linfeng (M) <linfeng23@xxxxxxxxxx>; wangxin (U)
> <wangxinxin.wang@xxxxxxxxxx>; Huangweidong (C)
> <weidong.huang@xxxxxxxxxx>
> Subject: Re: RFC: Split EPT huge pages in advance of dirty logging
> 
> On Fri, Feb 21, 2020 at 2:08 PM Junaid Shahid <junaids@xxxxxxxxxx> wrote:
> >
> > On 2/20/20 9:34 AM, Ben Gardon wrote:
> > >
> > > FWIW, we currently do this eager splitting at Google for live
> > > migration. When the log-dirty-memory flag is set on a memslot we
> > > eagerly split all pages in the slot down to 4k granularity.
> > > As Jay said, this does not cause crippling lock contention because
> > > the vCPU page faults generated by write protection / splitting can
> > > be resolved in the fast page fault path without acquiring the MMU lock.
> > > I believe +Junaid Shahid tried to upstream this approach at some
> > > point in the past, but the patch set didn't make it in. (This was
> > > before my time, so I'm hoping he has a link.) I haven't done the
> > > analysis to know if eager splitting is more or less efficient with
> > > parallel slow-path page faults, but it's definitely faster under the
> > > MMU lock.
> > >
> >
> > I am not sure if we ever posted those patches upstream. Peter Feiner
> > would know for sure. One notable difference in what we do compared to
> > the approach outlined by Jay is that we don't rely on tdp_page_fault()
> > to do the splitting. So we don't have to create a dummy VCPU and the
> > specialized split function is also much faster.

I'm curious about how you implemented this, especially since you mentioned
that it is much faster without a dummy VCPU.

> We've been carrying these patches since 2015. I've never posted them.
> Getting them in shape for upstream consumption will take some work. I can
> look into this next week.

It would be great if you could post them upstream.

Regards,
Jay Zhou

> 
> Peter



