Re: [PATCH] x86/kvm: Account for fpstate->user_xfeatures changes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 14, 2023 at 10:05:57AM -0700, Dongli Zhang wrote:
> 
> 
> On 9/14/23 2:11 AM, Tyler Stachecki wrote:
> > On Thu, Sep 14, 2023 at 04:15:54AM -0300, Leonardo Bras wrote:
> >> So, IIUC, the xfeatures from the source guest will be different than the 
> >> xfeatures of the target (destination) guest. Is that correct?
> > 
> > Correct.
> >  
> >> It does not seem right to me. I mean, from the guest viewpoint, some 
> >> features will simply vanish during execution, and this could lead to major 
> >> issues in the guest.
> 
> I fully agree with this.
> 
> I think the original commit ad856280ddea ("x86/kvm/fpu: Limit guest
> user_xfeatures to supported bits of XCR0") is for the source server, not
> destination server.
> 
> That is:
> 
> 1. Without the commit (src and dst), something bad may happen.
> 
> 2. With the commit on src, issue is fixed.
> 
> 3. With the commit only dst, it is expected that issue is not fixed.
> 
> Therefore, from administrator's perspective, the bugfix should always be applied
> no the source server, in order to succeed the migration.
> 
> 
> BTW, we may not be able to use commit ad856280ddea in the Fixes tag.
> 
> > 
> > My assumption is that the guest CPU model should confine access to registers
> > that make sense for that (guest) CPU.
> > 
> > e.g., take a host CPU capable of AVX-512 running a guest CPU model that only
> > has AVX-256. If the guest suddenly loses the top 256 bits of %zmm*, it should
> > not really be perceivable as %ymm architecturally remains unchanged.
> > 
> > Though maybe I'm being too rash here? Is there a case where this assumption
> > breaks down?
> > 
> >> The idea here is that if the target (destination) host can't provide those 
> >> features for the guest, then migration should fail.
> >>
> >> I mean, qemu should fail the migration, and that's correct behavior.
> >> Is it what is happening?
> > 
> > Unfortunately, no, it is not... and that is biggest concern right now.
> > 
> > I do see some discussion between Peter and you on this topic and see that
> > there was an RFC to implement such behavior stemming from it, here:
> > https://lore.kernel.org/qemu-devel/20220607230645.53950-1-peterx@xxxxxxxxxx/
> 
> I agree that bug is at QEMU side, not KVM side.
> 
> It is better to improve at QEMU side.

I agree fixing QEMU is the best solution we have.

> 
> 4508 int kvm_arch_put_registers(CPUState *cpu, int level)
> 4509 {
> 4510     X86CPU *x86_cpu = X86_CPU(cpu);
> 4511     int ret;
> ... ...
> 4546     ret = kvm_put_xsave(x86_cpu);
> 4547     if (ret < 0) {
> 4548         return ret;
> 4549     }
> ... ...--> the rest of kvm_arch_put_registers() won't execute !!!
> 
> Thank you very much!
> 
> Dongli Zhang
> 
> > 
> > ... though I do not believe that work ever landed in the tree. Looking at
> > qemu's master branch now, the error from kvm_arch_put_registers is just
> > discarded in do_kvm_cpu_synchronize_post_init...
> > 
> > ```
> > static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
> > {
> >     kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE);
> >     cpu->vcpu_dirty = false;
> > }
> > ```
> > 
> > Best,
> > Tyler
> 




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux