On Wed, Sep 25, 2024, Keith Busch wrote:
> On Sun, Sep 22, 2024 at 05:57:05AM -0700, Sean Christopherson wrote:
> > On Tue, Aug 20, 2024, Keith Busch wrote:
> > > To test, I executed the following program against a qemu emulated pci
> > > device resource. Prior to this kernel patch, it would fail with
> > >
> > >   traps: vmovdq[378] trap invalid opcode ip:4006b2 sp:7ffe2f5bb680 error:0 in vmovdq[6b2,400000+1000]
> >
> > ...
> >
> > > +static const struct gprefix pfx_avx_0f_6f_0f_7f = {
> > > +	N, I(Avx | Aligned, em_mov), N, I(Avx | Unaligned, em_mov),
> > > +};
> > > +
> > > +static const struct opcode avx_0f_table[256] = {
> > > +	/* 0x00 - 0x5f */
> > > +	X16(N), X16(N), X16(N), X16(N), X16(N), X16(N),
> > > +	/* 0x60 - 0x6F */
> > > +	X8(N), X4(N), X2(N), N,
> > > +	GP(SrcMem | DstReg | ModRM | Mov, &pfx_avx_0f_6f_0f_7f),
> > > +	/* 0x70 - 0x7F */
> > > +	X8(N), X4(N), X2(N), N,
> > > +	GP(SrcReg | DstMem | ModRM | Mov, &pfx_avx_0f_6f_0f_7f),
> > > +	/* 0x80 - 0xFF */
> > > +	X16(N), X16(N), X16(N), X16(N), X16(N), X16(N), X16(N), X16(N),
> > > +};
> >
> > Mostly as an FYI, we're likely going to run into more than just VMOVDQU sooner
> > rather than later.  E.g. gcc-13 with -march=x86-64-v3 (which per Vitaly is now
> > the default gcc behavior for some distros[*]) compiles this chunk from KVM
> > selftests' kvm_fixup_exception():
> >
> >	regs->rip = regs->r11;
> >	regs->r9 = regs->vector;
> >	regs->r10 = regs->error_code;
> >
> > into this monstrosity (which is clever, but oof).
> >
> >   405313:	c4 e1 f9 6e c8       	vmovq  %rax,%xmm1
> >   405318:	48 89 68 08          	mov    %rbp,0x8(%rax)
> >   40531c:	48 89 e8             	mov    %rbp,%rax
> >   40531f:	c4 c3 f1 22 c4 01    	vpinsrq $0x1,%r12,%xmm1,%xmm0
> >   405325:	49 89 6d 38          	mov    %rbp,0x38(%r13)
> >   405329:	c5 fa 7f 45 00       	vmovdqu %xmm0,0x0(%rbp)
> >
> > I wouldn't be surprised if the same packing shenanigans get employed when generating
> > code for a struct overlay of emulated MMIO.
>
> Thanks for the notice. I'm hoping we can proceed with just the mov
> instructions for now, unless someone already has a real use for these on
> emulated MMIO. Otherwise, we can cross that bridge when we get there.

Oh, yeah, for sure.  The FYI was really for Paolo, e.g. to make sure we don't
make assumptions in the emulator or something and make our future lives harder
(I haven't looked at your patch in any detail, so my fears could be completely
unfounded).

> As it is, if just the vmovdq[u,a] are okay, I have a follow on for
> vmovdqu64, though I'm currently having trouble adding AVX-512 registers.
> Simply increasing the size of the struct x86_emulate_ctxt appears to
> break something even without trying to emulate those instructions. But I
> want to wait to see if this first part is okay before spending too much
> time on it.
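
For anyone sanity-checking the layout of the quoted avx_0f_table, here is a
minimal standalone sketch, not the kernel code. The X2..X16 repeat macros are
written in the style of the emulator's helpers, but everything prefixed with
demo_/DEMO_ is a hypothetical stand-in: a plain enum replaces struct opcode and
the GP()/I() entries. Unlike the patch, the array size is left off on purpose
so the compiler counts the initializers and the static_assert can catch a
miscounted row.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>

/* Repeat macros in the style of the emulator's X2..X16 helpers. */
#define X2(x)	x, x
#define X4(x)	X2(x), X2(x)
#define X8(x)	X4(x), X4(x)
#define X16(x)	X8(x), X8(x)

/* Hypothetical stand-in for struct opcode: only records what a slot holds. */
enum demo_kind { DEMO_NONE = 0, DEMO_LOAD, DEMO_STORE };
#define N DEMO_NONE

/* Same row structure as the quoted table; size deliberately omitted. */
static const enum demo_kind demo_avx_0f_table[] = {
	/* 0x00 - 0x5f */
	X16(N), X16(N), X16(N), X16(N), X16(N), X16(N),
	/* 0x60 - 0x6f: 15 empty slots, then the 0x6f load form */
	X8(N), X4(N), X2(N), N, DEMO_LOAD,
	/* 0x70 - 0x7f: 15 empty slots, then the 0x7f store form */
	X8(N), X4(N), X2(N), N, DEMO_STORE,
	/* 0x80 - 0xff */
	X16(N), X16(N), X16(N), X16(N), X16(N), X16(N), X16(N), X16(N),
};

/* 96 + 16 + 16 + 128 entries must add up to one slot per opcode byte. */
static_assert(sizeof(demo_avx_0f_table) / sizeof(demo_avx_0f_table[0]) == 256,
	      "table must cover every opcode byte");

/* Look up what the sketch table holds for a given second opcode byte. */
enum demo_kind demo_lookup(unsigned char opcode)
{
	return demo_avx_0f_table[opcode];
}

/* Count the slots that are not DEMO_NONE. */
size_t demo_count_populated(void)
{
	size_t i, count = 0;

	for (i = 0; i < sizeof(demo_avx_0f_table) / sizeof(demo_avx_0f_table[0]); i++)
		count += demo_avx_0f_table[i] != DEMO_NONE;
	return count;
}
```

Calling demo_lookup(0x6f) and demo_lookup(0x7f) from a small main() should
return the load and store kinds respectively, with every other slot empty.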