Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

On Tue, May 11, 2010 at 9:51 AM, Anthony Liguori <anthony@xxxxxxxxxxxxx> wrote:
> On 05/11/2010 09:53 AM, Avi Kivity wrote:
>>
>> On 05/11/2010 05:17 PM, Cam Macdonell wrote:
>>>
>>>> The master is the shared memory area.  It's a completely separate entity
>>>> that is represented by the backing file (or shared memory server handing
>>>> out the fd to mmap).  It can exist independently of any guest.
>>>
>>> I think the master/peer idea would be necessary if we were sharing
>>> guest memory (sharing guest A's memory with guest B).  Then if the
>>> master (guest A) dies, perhaps something needs to happen to preserve
>>> the memory contents.
>>
>> Definitely.  But we aren't...
>
> Then transparent live migration is impossible.  IMHO, that's a fundamental
> mistake that we will regret down the road.
>
>>>   But since we're sharing host memory, the
>>> applications in the guests can race to determine the master by
>>> grabbing a lock at offset 0 or by using lowest VM ID.
>>>
>>> Looking at it another way, it is the applications using shared memory
>>> that may or may not need a master, the Qemu processes don't need the
>>> concept of a master since the memory belongs to the host.
>>
>> Exactly.  Furthermore, even in a master/slave relationship, there will be
>> different masters for different sub-areas, it would be a pity to expose all
>> this in the hardware abstraction.  This way we have an external device, and
>> PCI HBAs which connect to it - just like a multi-tailed SCSI disk.
>
> To support transparent live migration, it's necessary to do two things:
>
> 1) Preserve the memory contents of the PCI BAR after it is disconnected
> from a shared memory segment
> 2) Synchronize any changes made to the PCI BAR with the shared memory
> segment upon reconnect/initial connection.
>
> N.B. savevm/loadvm both constitute disconnect and reconnect events
> respectively.
>
> Supporting (1) is easy since we just need to memcpy() the contents of the
> shared memory segment to a temporary RAM area upon disconnect.
>
> Supporting (2) is easy when the shared memory segment is viewed as owned by
> the guest since it has the definitive copy of the data.  IMHO, this is what
> role=master means.  However, if we want to support a model where the guest
> does not have a definitive copy of the data, upon reconnect, we need to
> throw away the guest's changes and make the shared memory segment appear to
> simultaneously update to the guest.  This is what role=peer means.
>
> For role=peer, it's necessary to signal to the guest when it's not
> connected.  This means prior to savevm it's necessary to indicate to the
> guest that it's been disconnected.
>
> I think it's important that we build this mechanism in from the start
> because as I've stated in the past, I don't think role=peer is going to be
> the dominant use-case.  I actually don't think that shared memory between
> guests is all that interesting compared to shared memory to an external
> process on the host.
>
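
To check that I am reading the two roles the same way, here is a minimal
sketch of the disconnect/reconnect behaviour described above.  It is
illustrative Python, not Qemu code: the bytearrays stand in for the
private copy of the BAR and for the shared memory segment, and the
function names are made up for this example.

```python
def disconnect(shared: bytearray) -> bytearray:
    """Step (1): keep a private copy of the segment contents on disconnect."""
    return bytearray(shared)


def reconnect(role: str, private: bytearray, shared: bytearray) -> None:
    """Step (2): reconcile the private copy with the segment on reconnect."""
    if role == "master":
        # The guest holds the definitive copy, so push it to the segment.
        shared[:] = private
    elif role == "peer":
        # The segment is definitive, so the guest's changes are discarded.
        private[:] = shared
    else:
        raise ValueError(f"unknown role: {role!r}")
```

With role=peer the guest can lose writes made while disconnected, which
is why it would have to be told about the disconnect.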

Most of the people I hear from who are using my patch are using a peer
model to share data between applications (simulations, JVMs, etc).
But guest-to-host applications work as well, of course.

I think "transparent migration" can be achieved by making the
connected/disconnected state transparent to the application.

When using the shared memory server, the server has to be set up anyway
on the new host and copying the memory region could be part of that as
well if the application needs the contents preserved.  I don't think
it has to be handled by the savevm/loadvm operations.  There's little
difference between naming one VM the master or letting the shared
memory server act like a master.
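
As a rough sketch of what I mean by making the copy part of server
setup: the code below is illustrative Python, not the actual shared
memory server.  The function names and the "preserve" flag are
hypothetical, and both paths are local files here, whereas a real
migration would have to move the contents between hosts.

```python
import mmap
import os
import shutil


def open_region(path: str, size: int) -> mmap.mmap:
    """Map a backing file read/write, creating and sizing it if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)  # new space reads back as zeros
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping keeps its own reference to the file


def setup_on_new_host(old_path: str, new_path: str, size: int,
                      preserve: bool) -> mmap.mmap:
    """Set up the region on the destination, optionally keeping contents."""
    if preserve:
        # Only done when the application needs the old contents.
        shutil.copyfile(old_path, new_path)
    return open_region(new_path, size)
```

Whether to preserve the contents stays a decision of whoever sets up
the server, not of savevm/loadvm.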

I think abstractions on top of shared memory could handle
disconnection issues (sort of how TCP handles them for networks) if
the application needs it.  Again, my opinion is to leave it to the
application to decide what is necessary.
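
For example, an application could keep a small header at the start of
the region and use it to notice that the region was reset underneath
it.  The following is only a sketch in Python: the header layout (an
epoch plus a sequence number) and all names are assumptions for
illustration, and it ignores the atomicity and memory-ordering issues
that real code sharing memory between guests would have to handle.

```python
import struct

# Hypothetical header at offset 0: 32-bit epoch, 64-bit sequence number.
HEADER = struct.Struct("<IQ")


def publish(region, payload: bytes) -> None:
    """Write a payload after the header and bump the sequence number."""
    epoch, seq = HEADER.unpack_from(region, 0)
    region[HEADER.size:HEADER.size + len(payload)] = payload
    HEADER.pack_into(region, 0, epoch, seq + 1)


def reinitialise(region) -> None:
    """Called by whoever recreates the region: start a new epoch."""
    epoch, _ = HEADER.unpack_from(region, 0)
    HEADER.pack_into(region, 0, epoch + 1, 0)


class Channel:
    """Tracks the last header seen so a reset can be told from new data."""

    def __init__(self, region):
        self.region = region
        self.epoch, self.seq = HEADER.unpack_from(region, 0)

    def check(self) -> str:
        epoch, seq = HEADER.unpack_from(self.region, 0)
        if epoch != self.epoch:
            # The region was recreated; the application must resync.
            self.epoch, self.seq = epoch, seq
            return "reset"
        if seq != self.seq:
            self.seq = seq
            return "data"
        return "idle"
```

An application that does not care about lost contents can simply ignore
the "reset" case.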

Cam