On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote: > On ARMv7 & ARMv8 some load/store instructions might trigger a data abort > exception with no valid ISS info to be decoded. The lack of decode info > makes it at least tricky to emulate those instruction which is one of the > (many) reasons why KVM will not even try to do so. > > Add support for handling those by requesting KVM to inject external > dabt into the quest. > > Signed-off-by: Beata Michalska <beata.michalska@xxxxxxxxxx> > --- > target/arm/cpu.h | 2 ++ > target/arm/kvm.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > target/arm/kvm32.c | 3 ++ > target/arm/kvm64.c | 3 ++ > target/arm/kvm_arm.h | 19 +++++++++++ > 5 files changed, 123 insertions(+) > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > index c1aedbe..e04a8d3 100644 > --- a/target/arm/cpu.h > +++ b/target/arm/cpu.h > @@ -558,6 +558,8 @@ typedef struct CPUARMState { > uint8_t has_esr; > uint64_t esr; > } serror; > + /* Status field for pending external dabt */ > + uint8_t ext_dabt_pending; > > /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */ > uint32_t irq_line_state; > diff --git a/target/arm/kvm.c b/target/arm/kvm.c > index 8d82889..e7bc9b7 100644 > --- a/target/arm/kvm.c > +++ b/target/arm/kvm.c > @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = { > > static bool cap_has_mp_state; > static bool cap_has_inject_serror_esr; > +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */ nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary > > static ARMHostCPUFeatures arm_host_cpu_features; > > @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs) > KVM_CAP_ARM_INJECT_SERROR_ESR); > } > > +void kvm_arm_init_ext_dabt_injection(CPUState *cs) > +{ > + cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state, > + KVM_CAP_ARM_INJECT_EXT_DABT); > +} > + > bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try, > int *fdarray, > struct kvm_vcpu_init *init) > @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s) > ret = -EINVAL; > } > > + if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER)) > + if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) { > + warn_report("Failed to enable DABT NISV cap"); > + } > + Missing {} around the outer block. As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see you've followed the pattern used for cap_has_inject_serror_esr, but that looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM capability. The way it is now we just keep setting cap_has_inject_serror_esr to the same value, NR_VCPUS times. > return ret; > } > > @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu) > events.exception.serror_esr = env->serror.esr; > } > > + if (cap_has_inject_ext_dabt) { > + events.exception.ext_dabt_pending = env->ext_dabt_pending; > + } > + > ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events); > if (ret) { > error_report("failed to put vcpu events"); > @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu) > env->serror.has_esr = events.exception.serror_has_esr; > env->serror.esr = events.exception.serror_esr; > > + env->ext_dabt_pending = events.exception.ext_dabt_pending; > + afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot get this state. Therefore the above line (and extra stray blank line) should be dropped. > return 0; > } > > @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run) > { > } > > + stray blank line > MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run) > { > ARMCPU *cpu; > @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run) > ret = EXCP_DEBUG; > } /* otherwise return to guest */ > break; > + case KVM_EXIT_ARM_NISV: > + /* External DABT with no valid iss to decode */ > + ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss, > + run->arm_nisv.fault_ipa); > + break; > default: > qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n", > __func__, run->exit_reason); > @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data) > { > return (data - 32) & 0xffff; > } > + > +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss, > + uint64_t fault_ipa) > +{ > + ARMCPU *cpu = ARM_CPU(cs); > + CPUARMState *env = &cpu->env; > + uint32_t ins, ins_fetched; ins_fetched is a bool > + > + /* > + * Hacky workaround for kernels that for aarch32 guests, instead of expected > + * external data abort, inject the IMPLEMENTATION DEFINED exception with the > + * lock-down. This is actually handled by the guest which results in > + * re-running the faulting instruction. > + * This intends to break the vicious cycle. > + */ This doesn't seem like the right thing to do. What happens on real hardware in this case? If an OS would get into a "vicious cycle" on real hardware then it should get into one on KVM too. > + if (!is_a64(env)) { > + static uint8_t setback; > + > + /* > + * The state has not been synchronized yet, so if this is re-occurrence > + * of the same abort triggered by guest, the status for pending external > + * abort should not get cleared yet > + */ > + if (unlikely(env->ext_dabt_pending)) { > + if (setback) { > + error_report("Most probably triggered kernel issue with" > + " injecting external data abort."); > + error_printf("Giving up trying ...\n"); > + /* Here is where the fun comes to an end */ > + return -EINVAL; > + } > + } > + setback = env->ext_dabt_pending; > + } > + > + kvm_cpu_synchronize_state(cs); > + /* > + * ISS [23:14] is invalid so there is a limited info > + * on what has just happened so the only *useful* thing that can > + * be retrieved from ISS is WnR & DFSC (though in some cases WnR > + * might be less of a value as well) > + */ > + > + /* > + * Get current PC before it will get updated to exception vector entry > + */ > + target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15]; ins_addr should be declared above But what are we doing? pc is a guest virtual address. Oh, I see... > + > + /* > + * Get the faulting instruction > + * Instructions have a fixed length of 32 bits > + * and are always little-endian > + */ > + ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins, > + sizeof(uint32_t), 0); ... we're trying to actual walk the KVM guest's page tables. That seems a bit odd, and the "_debug" function name used for it makes it seem even more odd. > + > + error_report("Data abort exception with no valid ISS generated by " > + "guest memory access at physical address: 0x" TARGET_FMT_lx, > + (target_ulong)fault_ipa); > + > + error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n" : "%s\n", > + "KVM unable to emulate faulting instruction", > + (!ins_fetched ? 0 : ldl_le_p(&ins))); > + error_printf("Injecting a data abort exception into guest.\n"); The guest shouldn't be able to generate three lines of errors on the host whenever it wants. That's a security bug. One trace point here seems like the most we should do. Or, after these three lines we should kill the guest. Actually, I don't think we should attempt to get the instruction at all. We should do what the KVM documenation suggests we do when we get this exit. We should either do a user selected action: one of suspend, dump, restart, or inject a dabt (as is done below). The last choice hopes that the guest will handle it in some graceful way, or that it'll crash with all the information needed for a post-mortem crash investigation. And I don't think we should do the "very brave" option of trying to emulate the instruction, even if we identify it as a valid MMIO address. That just sounds like a huge can of worms. Does QEMU already have a way for users to select a don't-know-how-to-handle-guest-exception behavior? > + /* > + * Set pending ext dabt and trigger SET_EVENTS so that > + * KVM can inject the abort > + */ > + env->ext_dabt_pending = 1; > + > + return 0; > +} Thanks, drew _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm