Xieming, This is the second time I fix email addresses for you. Next time, I simply won't bother replying. On Fri, 01 Apr 2022 10:08:28 +0100, xieming <xieming@xxxxxxxxxx> wrote: > > when passthrough some pcie device, such as gpus(including > Nvidia and AMD),kvm will report:"Unsupported FSC: EC=0x24 > xFSC=0x21 ESR_EL2=0x92000061" err.the main reason is vfio I have asked you to describe how you get there, and you still haven't bothered replying. > ioremap vga memory type by DEVICE_nGnRnE, and kvm setting > memory type to PAGE_S2_DEVICE(DEVICE_nGnRE), but in guestos, > all of device io memory type when ioremapping (including gpu > driver TTM memory type) is setting to MT_NORMAL_NC. > > according to ARM64 stage1&stage2 conbining rules. > memory type attributes combining rules: > Normal-WB<Normal-WT<NormalNC<Device-GRE<Device-nGRE< > DevicenGnRE<Device-nGnRnE > Normal-WB is weakest,Device-nGnRnE is strongest. > > refferring to 'Arm Architecture Reference Manual Armv8, > for Armv8-A architecture profile' pdf, chapter B2.8 > refferring to 'ARM System Memory Management Unit Architecture > Specification SMMU architecture version 3.0 and version 3.1' pdf, > chapter 13.1.5 > > therefore, the I/O memory attribute of the VM is setting to > DevicenGnRE maybe is a mistake. it causes all device memory > accessing in the virtual machine must be aligned. > > To summarize: stage2 memory type cannot be stronger than stage1 > in arm64 archtechture. You are plain wrong. It can, and most of the time, it *must*. > > Signed-off-by: xieming <xieming@xxxxxxxxxx> > --- > arch/arm/include/asm/kvm_mmu.h | 3 ++- > arch/arm64/include/asm/kvm_mmu.h | 3 ++- > arch/arm64/include/asm/memory.h | 4 +++- > arch/arm64/include/asm/pgtable-prot.h | 2 +- > drivers/vfio/pci/vfio_pci.c | 7 +++++++ > virt/kvm/arm/mmu.c | 19 ++++++++++++++++--- > virt/kvm/arm/vgic/vgic-v2.c | 2 +- > 7 files changed, 32 insertions(+), 8 deletions(-) > > diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h > index 523c499e42db..5c7869d25b62 100644 > --- a/arch/arm/include/asm/kvm_mmu.h > +++ b/arch/arm/include/asm/kvm_mmu.h This file has been removed from the tree *over two years ago*. > @@ -64,7 +64,8 @@ void stage2_unmap_vm(struct kvm *kvm); > int kvm_alloc_stage2_pgd(struct kvm *kvm); > void kvm_free_stage2_pgd(struct kvm *kvm); > int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, > - phys_addr_t pa, unsigned long size, bool writable); > + phys_addr_t pa, unsigned long size, > + bool writable, bool writecombine); > > int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run); > > diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h > index b2558447c67d..3f98286c7498 100644 > --- a/arch/arm64/include/asm/kvm_mmu.h > +++ b/arch/arm64/include/asm/kvm_mmu.h > @@ -158,7 +158,8 @@ void stage2_unmap_vm(struct kvm *kvm); > int kvm_alloc_stage2_pgd(struct kvm *kvm); > void kvm_free_stage2_pgd(struct kvm *kvm); > int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, > - phys_addr_t pa, unsigned long size, bool writable); > + phys_addr_t pa, unsigned long size, > + bool writable, bool writecombine); NAK. For a start, there is no such thing as 'write-combine' in the ARM architecture, and I'm not convinced you can equate WC to Normal-NC. See the previous discussion at [1]. [1] https://lore.kernel.org/r/20210429162906.32742-1-sdonthineni@xxxxxxxxxx [...] > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > index 51b791c750f1..6f66efb71743 100644 > --- a/drivers/vfio/pci/vfio_pci.c > +++ b/drivers/vfio/pci/vfio_pci.c > @@ -1452,7 +1452,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma) > } > > vma->vm_private_data = vdev; > +#ifdef CONFIG_ARM64 > + if (vfio_pci_is_vga(pdev)) > + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > + else > + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); No. That's completely unacceptable. Who says that some VGA (who the hell implements VGA these days?) implies any sort of attribute other than device memory? This may work for your particular device under your own circumstances. Can it be generalised? No. And as Jason pointed out, this is likely to break userspace. > +#else > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > +#endif > vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff; > > /* > diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c > index 11103b75c596..a46a58696834 100644 > --- a/virt/kvm/arm/mmu.c > +++ b/virt/kvm/arm/mmu.c > @@ -206,6 +206,17 @@ static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp) > dsb(ishst); > } > > +/** > + * is_vma_write_combine - check if VMA is mapped with writecombine or not > + * Return true if VMA mapped with MT_NORMAL_NC otherwise fasle > + */ > +static inline bool is_vma_write_combine(struct vm_area_struct *vma) > +{ > + pteval_t pteval = pgprot_val(vma->vm_page_prot); > + > + return ((pteval & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL_NC)); > +} Again, you are making tons of assumptions here, none of which are acceptable as is. M. -- Without deviation from the norm, progress is not possible. _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm