Re: [for-6.0 v5 11/13] spapr: PEF: prevent migration

Christian Borntraeger <borntraeger@xxxxxxxxxx> · Thu, 14 Jan 2021 16:25:21 +0100

On 14.01.21 15:15, Daniel P. Berrangé wrote:
> On Thu, Jan 14, 2021 at 03:09:01PM +0100, Christian Borntraeger wrote:
>>
>>
>> On 14.01.21 15:04, Cornelia Huck wrote:
>>> On Thu, 14 Jan 2021 12:20:48 +0000
>>> Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
>>>
>>>> On Thu, Jan 14, 2021 at 12:50:12PM +0100, Christian Borntraeger wrote:
>>>>>
>>>>>
>>>>> On 14.01.21 12:45, Dr. David Alan Gilbert wrote:  
>>>>>> * Cornelia Huck (cohuck@xxxxxxxxxx) wrote:  
>>>>>>> On Thu, 14 Jan 2021 11:52:11 +0100
>>>>>>> Christian Borntraeger <borntraeger@xxxxxxxxxx> wrote:
>>>>>>>  
>>>>>>>> On 14.01.21 11:36, Dr. David Alan Gilbert wrote:  
>>>>>>>>> * Christian Borntraeger (borntraeger@xxxxxxxxxx) wrote:    
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:    
>>>>>>>>>>> * Cornelia Huck (cohuck@xxxxxxxxxx) wrote:    
>>>>>>>>>>>> On Tue, 5 Jan 2021 12:41:25 -0800
>>>>>>>>>>>> Ram Pai <linuxram@xxxxxxxxxx> wrote:
>>>>>>>>>>>>    
>>>>>>>>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:    
>>>>>>>>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
>>>>>>>>>>>>>> Ram Pai <linuxram@xxxxxxxxxx> wrote:    
>>>>>>>>>>>>    
>>>>>>>>>>>>>>> The main difference between my proposal and the other proposal is...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   In my proposal the guest makes the compatibility decision and acts
>>>>>>>>>>>>>>>   accordingly.  In the other proposal QEMU makes the compatibility
>>>>>>>>>>>>>>>   decision and acts accordingly. I argue that QEMU cannot make a good
>>>>>>>>>>>>>>>   compatibility decision, because it won't know in advance if the guest
>>>>>>>>>>>>>>>   will or will-not switch-to-secure.
>>>>>>>>>>>>>>>       
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You have a point there when you say that QEMU does not know in advance,
>>>>>>>>>>>>>> if the guest will or will-not switch-to-secure. I made that argument
>>>>>>>>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
>>>>>>>>>>>>>> was to flip that property on demand when the conversion occurs. David
>>>>>>>>>>>>>> explained to me that this is not possible for ppc, and that having the
>>>>>>>>>>>>>> "securable-guest-memory" property (or whatever the name will be)
>>>>>>>>>>>>>> specified is a strong indication, that the VM is intended to be used as
>>>>>>>>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
>>>>>>>>>>>>>> try to transition). That argument applies here as well.      
>>>>>>>>>>>>>
>>>>>>>>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the
>>>>>>>>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
>>>>>>>>>>>>> Of course, this has to be done with a big fat warning stating
>>>>>>>>>>>>> "secure-guest-memory" feature is disabled on the machine.
>>>>>>>>>>>>> Doing so will continue to support guests that do not try to transition.
>>>>>>>>>>>>> Guests that try to transition will fail and terminate themselves.    
>>>>>>>>>>>>
>>>>>>>>>>>> Just to recap the s390x situation:
>>>>>>>>>>>>
>>>>>>>>>>>> - We currently offer a cpu feature that indicates secure execution to
>>>>>>>>>>>>   be available to the guest if the host supports it.
>>>>>>>>>>>> - When we introduce the secure object, we still need to support
>>>>>>>>>>>>   previous configurations and continue to offer the cpu feature, even
>>>>>>>>>>>>   if the secure object is not specified.
>>>>>>>>>>>> - As migration is currently not supported for secured guests, we add a
>>>>>>>>>>>>   blocker once the guest actually transitions. That means that
>>>>>>>>>>>>   transition fails if --only-migratable was specified on the command
>>>>>>>>>>>>   line. (Guests not transitioning will obviously not notice anything.)
>>>>>>>>>>>> - With the secure object, we will already fail starting QEMU if
>>>>>>>>>>>>   --only-migratable was specified.
>>>>>>>>>>>>
>>>>>>>>>>>> My suggestion is now that we don't even offer the cpu feature if
>>>>>>>>>>>> --only-migratable has been specified. For a guest that does not want to
>>>>>>>>>>>> transition to secure mode, nothing changes; a guest that wants to
>>>>>>>>>>>> transition to secure mode will notice that the feature is not available
>>>>>>>>>>>> and fail appropriately (or ultimately, when the ultravisor call fails).
>>>>>>>>>>>> We'd still fail starting QEMU for the secure object + --only-migratable
>>>>>>>>>>>> combination.
>>>>>>>>>>>>
>>>>>>>>>>>> Does that make sense?    
>>>>>>>>>>>
>>>>>>>>>>> It's a little unusual; I don't think we have any other cases where
>>>>>>>>>>> --only-migratable changes the behaviour; I think it normally only stops
>>>>>>>>>>> you doing something that would have made it unmigratable or causes
>>>>>>>>>>> an operation that would make it unmigratable to fail.    
>>>>>>>>>>
>>>>>>>>>> I would like to NOT block this feature with --only-migratable. A guest
>>>>>>>>>> can start up unprotected (and then it is migratable). The migration blocker
>>>>>>>>>> is really a dynamic aspect during runtime.     
>>>>>>>>>
>>>>>>>>> But the point of --only-migratable is to turn things that would have
>>>>>>>>> blocked migration into failures, so that a VM started with
>>>>>>>>> --only-migratable is *always* migratable.    
>>>>>>>>
>>>>>>>> Hmmm, fair enough. How do we do this with host-model? The constructed model
>>>>>>>> would contain unpack, but then it will fail to startup? Or do we silently 
>>>>>>>> drop unpack in that case? Both variants do not feel completely right.   
>>>>>>>
>>>>>>> Failing if you explicitly specified unpack feels right, but failing
>>>>>>> if you just used the host model feels odd. Removing unpack also is a
>>>>>>> bit odd, but I think the better option if we want to do anything about
>>>>>>> it at all.  
>>>>>>
>>>>>> 'host-model' feels a bit special; but breaking the rule that
>>>>>> only-migratable doesn't change behaviour is weird.
>>>>>> Can you do host,-unpack to make that work explicitly?  
>>>>>
>>>>> I guess that should work. But it means that we need to add logic in libvirt
>>>>> to disable unpack for host-passthrough and host-model. Next problem is then,
>>>>> that a future version might implement migration of such guests, which means
>>>>> that libvirt must then stop fencing unpack.  
>>>>
>>>> The "host-model" is supposed to always be migratable, so we should
>>>> fence the feature there.
>>>>
>>>> host-passthrough is "undefined" whether it is migratable - it may or may
>>>> not work, no guarantees made by libvirt.
>>>>
>>>> Ultimately I think the problem is that there ought to be an explicit
>>>> config to enable the feature for s390, as there is for SEV, and will
>>>> also presumably be needed for ppc. 
>>>
>>> Yes, an explicit config is what we want; unfortunately, we have to deal
>>> with existing setups as well...
>>>
>>> The options I see are
>>> - leave things for existing setups as they are now (i.e. might become
>>>   unmigratable when the guest transitions), and make sure we're doing
>>>   the right thing with the new object
>>> - always make the unpack feature conflict with migration requirements;
>>>   this is a guest-visible change
>>>
>>> The first option might be less hairy, all considered?
>>
>> What about a libvirt change that removes unpack from the host-model as
>> soon as only-migratable is used? When that is in place, QEMU can reject
>> the combination of only-migratable + unpack.
> 
> I think libvirt needs to just unconditionally remove unpack from host-model
> regardless, and require an explicit opt in. We can do that in libvirt
> without compat problems, because we track the expansion of "host-model"
> for existing running guests.

This is true for running guests, but not for shutdown and restart.

I would really like to avoid bad (and hard-to-debug) surprises where a guest boots
fine with libvirt version x and then fails with x+1. So as a first step
I am fine with libvirt removing "unpack" from the default host-model expansion
if the --only-migratable parameter is used. But looking at libvirt, I
cannot actually find any code that uses this parameter. Are there some patches
posted somewhere?
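
To make sure we are talking about the same behaviour, here is a rough
sketch of what I have in mind. It is Python pseudologic only: every name in
it (CpuRequest, expand_cpu, NON_MIGRATABLE_FEATURES) is made up for
illustration and is not libvirt or QEMU code. The host-model expansion drops
"unpack" only when only-migratable is requested, and an explicit request for
"unpack" together with only-migratable is rejected.

```python
from dataclasses import dataclass

# Hypothetical: features that can make a guest unmigratable at runtime.
NON_MIGRATABLE_FEATURES = frozenset({"unpack"})


@dataclass(frozen=True)
class CpuRequest:
    """Hypothetical stand-in for the CPU part of a guest definition."""
    mode: str                                   # "host-model" or "custom"
    explicit_features: frozenset = frozenset()  # features the user asked for


def expand_cpu(request, host_features, only_migratable):
    """Return the feature set that would be handed to the guest."""
    if request.mode == "host-model":
        features = set(host_features)
        if only_migratable:
            # Silently fence the feature in the default expansion.
            features -= NON_MIGRATABLE_FEATURES
    else:
        features = set()

    # Explicitly requested features are never dropped silently.
    features |= request.explicit_features

    conflict = features & NON_MIGRATABLE_FEATURES
    if only_migratable and conflict:
        raise ValueError(
            "only-migratable conflicts with: " + ", ".join(sorted(conflict))
        )
    return frozenset(features)
```

With this, an existing host-model guest without only-migratable keeps
"unpack" across shutdown and restart, so nothing changes between libvirt
version x and x+1 for such setups.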
> 
> QEMU could introduce a deprecation warning right now, and then turn it into
> an error after the deprecation cycle is complete.