Paolo Bonzini <pbonzini@xxxxxxxxxx> writes:

> On 03/03/20 14:02, Vitaly Kuznetsov wrote:
>> Right you are,
>>
>> a big hammer like
>>
>> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
>> index 2a8f2bd..52c9bce 100644
>> --- a/arch/x86/include/asm/kvm_emulate.h
>> +++ b/arch/x86/include/asm/kvm_emulate.h
>> @@ -324,14 +324,6 @@ struct x86_emulate_ctxt {
>>          */
>>
>>         /* current opcode length in bytes */
>> -       u8 opcode_len;
>> -       u8 b;
>> -       u8 intercept;
>> -       u8 op_bytes;
>> -       u8 ad_bytes;
>> -       struct operand src;
>> -       struct operand src2;
>> -       struct operand dst;
>>         union {
>>                 int (*execute)(struct x86_emulate_ctxt *ctxt);
>>                 fastop_t fop;
>> @@ -343,6 +335,14 @@ struct x86_emulate_ctxt {
>>          * or elsewhere
>>          */
>>         bool rip_relative;
>> +       u8 opcode_len;
>> +       u8 b;
>> +       u8 intercept;
>> +       u8 op_bytes;
>> +       u8 ad_bytes;
>> +       struct operand src;
>> +       struct operand src2;
>> +       struct operand dst;
>>         u8 rex_prefix;
>>         u8 lock_prefix;
>>         u8 rep_prefix;
>>
>> seems to make the issue go away. (For those wondering why field
>> shuffling makes a difference: init_decode_cache() clears
>> [rip_relative, modrm) range) How did this even work before...
>> (I'm still looking at the code, stay tuned...)
>
> On AMD, probably because all these instructions were normally trapped by L1.
>
> Of these, however, most need not be zeroed again. op_bytes, ad_bytes,
> opcode_len and b are initialized by x86_decode_insn, and dst/src/src2
> also by decode_operand. So only intercept is affected, adding
> "ctxt->intercept = x86_intercept_none" should be enough.

This matches my findings, thank you! Patch[es] are coming.

--
Vitaly
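
For readers following along, here is a minimal user-space sketch of why the field order matters. It is not the kernel code: the two toy structs, the CLEAR_DECODE_RANGE macro and the TOY_* constants are hypothetical stand-ins, and the only thing taken from the thread is the idea that a memset covers the [rip_relative, modrm) range. A field declared before rip_relative keeps its stale value across the clear, a field moved inside the range is zeroed, and an explicit reset (the smaller fix suggested above) works with the original layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values; TOY_INTERCEPT_NONE plays the role of "no intercept". */
enum { TOY_INTERCEPT_NONE = 0, TOY_INTERCEPT_STALE = 5 };

/* Toy layout with intercept declared BEFORE the cleared range. */
struct ctxt_before {
	uint8_t intercept;
	uint8_t rip_relative;	/* first byte of the cleared range */
	uint8_t rex_prefix;
	uint8_t modrm;		/* first byte NOT cleared */
};

/* Toy layout with intercept moved INSIDE the cleared range. */
struct ctxt_after {
	uint8_t rip_relative;
	uint8_t intercept;
	uint8_t rex_prefix;
	uint8_t modrm;
};

/* Compile-time check that the toy layouts are what the comments claim. */
static_assert(offsetof(struct ctxt_before, intercept) <
	      offsetof(struct ctxt_before, rip_relative),
	      "intercept must precede the cleared range");
static_assert(offsetof(struct ctxt_after, intercept) >
	      offsetof(struct ctxt_after, rip_relative) &&
	      offsetof(struct ctxt_after, intercept) <
	      offsetof(struct ctxt_after, modrm),
	      "intercept must sit inside the cleared range");

/* Zero every byte from rip_relative up to, but not including, modrm. */
#define CLEAR_DECODE_RANGE(type, p)					\
	memset((unsigned char *)(p) + offsetof(type, rip_relative), 0,	\
	       offsetof(type, modrm) - offsetof(type, rip_relative))

/* Original layout: the stale intercept survives the range clear. */
uint8_t intercept_with_original_layout(void)
{
	struct ctxt_before c;

	memset(&c, 0, sizeof(c));
	c.intercept = TOY_INTERCEPT_STALE;	/* left over from a previous decode */
	CLEAR_DECODE_RANGE(struct ctxt_before, &c);
	return c.intercept;
}

/* Shuffled layout: the range clear now zeroes intercept as a side effect. */
uint8_t intercept_with_shuffled_layout(void)
{
	struct ctxt_after c;

	memset(&c, 0, sizeof(c));
	c.intercept = TOY_INTERCEPT_STALE;
	CLEAR_DECODE_RANGE(struct ctxt_after, &c);
	return c.intercept;
}

/* Original layout plus an explicit reset of the one affected field. */
uint8_t intercept_with_explicit_reset(void)
{
	struct ctxt_before c;

	memset(&c, 0, sizeof(c));
	c.intercept = TOY_INTERCEPT_STALE;
	CLEAR_DECODE_RANGE(struct ctxt_before, &c);
	c.intercept = TOY_INTERCEPT_NONE;
	return c.intercept;
}
```

Calling the three helpers from a main() shows the difference: only the first one returns the stale value. The explicit reset touches a single field and leaves the struct layout alone, which is presumably why it is preferable to the "big hammer".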