Hi Dongjiu Geng,

On 30/04/17 06:37, Dongjiu Geng wrote:
> when happen SEA, deliver signal bus and handle the ioctl that
> inject SEA abort to guest, so that guest can handle the SEA error.

> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 105b6ab..a96594f 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -20,8 +20,10 @@

> @@ -1238,6 +1240,36 @@ static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
>  	__coherent_cache_guest_page(vcpu, pfn, size);
>  }
>
> +static void kvm_send_signal(unsigned long address, bool hugetlb, bool hwpoison)
> +{
> +	siginfo_t info;
> +
> +	info.si_signo = SIGBUS;
> +	info.si_errno = 0;
> +	if (hwpoison)
> +		info.si_code = BUS_MCEERR_AR;
> +	else
> +		info.si_code = 0;
> +
> +	info.si_addr = (void __user *)address;
> +	if (hugetlb)
> +		info.si_addr_lsb = PMD_SHIFT;
> +	else
> +		info.si_addr_lsb = PAGE_SHIFT;
> +
> +	send_sig_info(SIGBUS, &info, current);
> +}
> +

Punit reviewed the other version of this patch: PMD_SHIFT is not the right
thing to use here. It needs a more accurate set of calls and shifts, as there
may be hugetlbfs pages of sizes other than PMD_SIZE.

https://www.spinics.net/lists/arm-kernel/msg568919.html

I haven't posted a new version of that patch because I was still hunting a bug
in the hugepage/hwpoison code: even with Punit's fixes series I see -EFAULT
returned to userspace instead of this hwpoison code being invoked.

Please avoid duplicating functionality between patches. It wastes reviewers'
time, especially when we know there are problems with this approach.

> +static void kvm_handle_bad_page(unsigned long address,
> +					bool hugetlb, bool hwpoison)
> +{
> +	/* handle both hwpoison and other synchronous external Abort */
> +	if (hwpoison)
> +		kvm_send_signal(address, hugetlb, true);
> +	else
> +		kvm_send_signal(address, hugetlb, false);
> +}

Why the extra level of indirection?

We only want to signal userspace like this from KVM for hwpoison.
Signals for RAS-related reasons should come from the bits of the kernel that
decoded the error.

(hwpoison for KVM is a corner case as Qemu's memory effectively has two users,
Qemu and KVM. This isn't the example of how user-space gets signalled.)

> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index b37446a..780e3c4 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -277,6 +277,13 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>  	return -EINVAL;
>  }
>
> +int kvm_vcpu_ioctl_sea(struct kvm_vcpu *vcpu)
> +{
> +	kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> +
> +	return 0;
> +}

> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index bb02909..1d2e2e7 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1306,6 +1306,7 @@ struct kvm_s390_ucas_mapping {
>  #define KVM_S390_GET_IRQ_STATE _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state)
>  /* Available with KVM_CAP_X86_SMM */
>  #define KVM_SMI _IO(KVMIO, 0xb7)
> +#define KVM_ARM_SEA _IO(KVMIO, 0xb8)
>
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
>

Why do we need a userspace API for SEA?
It can also be done by using KVM_{G,S}ET_ONE_REG to change the vcpu registers.
The advantage of doing it this way is that you can choose which ESR value to use.

Adding a new API call to do something you could already do with an old one
doesn't look right.


Thanks,

James
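
For reference on the si_addr_lsb point earlier in this mail: the value tells
the SIGBUS receiver how much memory around si_addr to treat as lost, which is
why hard-coding PMD_SHIFT is wrong when the hugetlbfs page is some other size.
A minimal userspace sketch of that calculation follows. The names
poison_range_from_siginfo() and struct poison_range are hypothetical, not part
of the patch or of any kernel API, and the shifts mentioned in the comments
(12, 21, 30) assume a 4K translation granule.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the range a SIGBUS receiver treats as lost. */
struct poison_range {
	uint64_t start;	/* si_addr rounded down to the reported granule */
	uint64_t len;	/* 1 << si_addr_lsb bytes */
};

/*
 * Derive the affected range from the two siginfo fields.
 * With a 4K granule, a base page would report lsb 12, a PMD-sized
 * huge page 21 and a PUD-sized huge page 30 (assumed values).
 */
static struct poison_range
poison_range_from_siginfo(uint64_t si_addr, unsigned int si_addr_lsb)
{
	struct poison_range r;

	assert(si_addr_lsb < 64);
	r.len = (uint64_t)1 << si_addr_lsb;
	r.start = si_addr & ~(r.len - 1);
	return r;
}
```

If the kernel reported PMD_SHIFT for a page that is really PUD-sized, a
receiver doing this arithmetic would discard only a fraction of the page that
was actually lost.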