Re: [PATCH 0/2][RFC] postcopy migration: Linux char device for postcopy

On 13 January 2012 02:15, Isaku Yamahata <yamahata@xxxxxxxxxxxxx> wrote:
> One more question.
> Does your architecture/implementation (in theory) allow KVM memory
> features like swap, KSM, THP?

* Swap: yes, we support swap to disk (the page is pulled from swap
before being sent over); the swap process does its job on the other side.
* KSM: same, we support KSM. The KSMed page is broken down and split,
and the resulting pages are sent individually (yes, suboptimal, but it
makes the protocol less messy), and we let the KSM daemon do its job on
the other side.
* THP: more sticky here. Due to time constraints we decided to support
it only partially. What this means: if we encounter a THP we break it
down to standard page granularity, as that is the memory unit we
currently manipulate. As a result you can have THP on the source but
you won't have THP on the other side.
  Note: we didn't fully explore the ramifications of THP with RDMA. I
don't know if THP plays well with the MMU of hardware RDMA NICs. One
thing I would like to explore is whether it is possible to break down
the THP into standard pages and then reassemble them on the other side
(does anyone know if it is possible to aggregate pages to form a THP
in the kernel?)
* cgroup: should work transparently, but we need to do more testing to
confirm that.
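
To illustrate the THP point above, here is a minimal user-space sketch
(not the actual kernel code) of breaking one huge page down into
standard-page transfer units. The 4 KiB and 2 MiB sizes are assumed
x86-64 defaults, and split_huge_page is a hypothetical helper name
introduced only for this example.

```python
PAGE_SIZE = 4096                   # assumed standard page size (x86-64)
HUGE_PAGE_SIZE = 2 * 1024 * 1024   # assumed THP size (x86-64)


def split_huge_page(base_addr, huge_size=HUGE_PAGE_SIZE, page_size=PAGE_SIZE):
    """Return (address, length) for each standard page covering one huge page.

    Hypothetical helper: each tuple stands for one unit that would be
    sent individually, so the receiver only ever sees standard pages.
    """
    if base_addr % huge_size != 0:
        raise ValueError("huge page base must be aligned to the huge page size")
    return [(base_addr + off, page_size)
            for off in range(0, huge_size, page_size)]


if __name__ == "__main__":
    units = split_huge_page(0x40000000)
    print(len(units), hex(units[0][0]), hex(units[-1][0]))
```

With these assumed sizes one huge page becomes 512 independent units,
which is why the destination ends up with standard pages only unless
something reassembles them afterwards.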




>
>
> On Fri, Jan 13, 2012 at 11:03:23AM +0900, Isaku Yamahata wrote:
>> Very interesting. We can cooperate for better (postcopy) live migration.
>> The code doesn't seem available yet, I'm eager for it.
>>
>>
>> On Fri, Jan 13, 2012 at 01:09:30AM +0000, Benoit Hudzia wrote:
>> > Hi,
>> >
>> > Sorry to hijack the thread like this, but I would like to inform
>> > you that we recently achieved a milestone in the research project
>> > I'm leading. We enhanced KVM in order to deliver post-copy live
>> > migration using RDMA at kernel level.
>> >
>> > A few points on the architecture of the system:
>> >
>> > * RDMA communication engine in kernel (you can use Soft-iWARP or
>> > Soft-RoCE if you don't have hardware acceleration; however, we also
>> > support standard RDMA-enabled NICs).
>>
>> Do you mean infiniband subsystem?
>>
>>
>> > * Naturally, pages are transferred with a zero-copy protocol
>> > * Leverage the async page fault system.
>> > * Pre paging / faulting
>> > * No context switch as everything is handled within kernel and using
>> > the page fault system.
>> > * Hybrid migration ( pre + post copy) available
>>
>> Ah, I've also been planning this.
>> After the pre-copy phase, is the dirty bitmap sent?
>>
>> So far I've thought naively that the pre-copy phase would be ended by
>> the number of iterations. Your choice, on the other hand, is a timeout
>> for the pre-copy phase. Do you have a rationale, or was it just
>> natural for you?
>>
>>
>> > * Rely on an independent Kernel Module
>> > * No modification to the KVM kernel Module
>> > * Minimal Modification to the Qemu-Kvm code
>> > * We plan to add the page prioritization algo in order to optimise the
>> > pre paging algo and background transfer
>>
>> Where do you plan to implement it? In qemu or in your kernel module?
>> This algo could be shared.
>>
>> thanks in advance.
>>
>> > You can learn a little bit more and see a demo here:
>> > http://tinyurl.com/8xa2bgl
>> > I hope to be able to provide more detail on the design soon, as
>> > well as a more concrete demo of the system (live migration of a VM
>> > running large enterprise apps such as ERP or an in-memory DB).
>> >
>> > Note: this is just a stepping stone, as the post-copy live migration
>> > mainly enables us to validate the architecture design and code.
>> >
>> > Regards
>> > Benoit
>> >
>> >
>> > On 12 January 2012 13:59, Avi Kivity <avi@xxxxxxxxxx> wrote:
>> > > On 01/04/2012 05:03 AM, Isaku Yamahata wrote:
>> > >> Yes, it's quite doable in user space (qemu) with a kernel enhancement.
>> > >> And it would be easy to convert a separate daemon process into a thread
>> > >> in qemu.
>> > >>
>> > >> I think it should be done outside of the qemu process for some reasons.
>> > >> (I'm just repeating the same discussion from the KVM Forum because no
>> > >> one remembers it)
>> > >>
>> > >> - ptrace (and its variants)
>> > >>   Some people want to investigate guest RAM on the host (qemu stopped or live).
>> > >>   For example, enhance the crash utility so that it attaches to the qemu
>> > >>   process and debugs the guest kernel.
>> > >
>> > > To debug the guest kernel you don't need to stop qemu itself.  I agree
>> > > it's a problem for qemu debugging though.
>> > >
>> > >>
>> > >> - core dump
>> > >>   The qemu process may core-dump.
>> > >>   For postmortem analysis, people want to investigate guest RAM.
>> > >>   Again, enhance the crash utility so that it reads the core file and
>> > >>   analyzes the guest kernel.
>> > >>   When the core is created, the qemu process is already dead.
>> > >
>> > > Yes, strong point.
>> > >
>> > >> Handling the fault in the qemu process precludes the above possibilities.
>> > >
>> > > I agree.
>> > >
>> > >
>> > > --
>> > > error compiling committee.c: too many arguments to function
>> > >
>> > > --
>> > > To unsubscribe from this list: send the line "unsubscribe kvm" in
>> > > the body of a message to majordomo@xxxxxxxxxxxxxxx
>> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >
>> >
>> >
>> > --
>> > " The production of too many useful things results in too many useless people"
>> >
>>
>> --
>> yamahata
>>
>
> --
> yamahata



-- 
" The production of too many useful things results in too many useless people"

