On 05/07/2013 03:29 PM, David Gibson wrote:
> On Mon, May 06, 2013 at 05:25:56PM +1000, Alexey Kardashevskiy wrote:
>> This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
>> and H_STUFF_TCE requests without passing them to QEMU, which should
>> save time on switching to QEMU and back.
>>
>> Both real and virtual modes are supported - whenever the kernel
>> fails to handle a TCE request in real mode, it passes it to the
>> virtual mode. If the virtual mode handlers fail, then the request
>> is passed to the user mode, for example, to QEMU.
>>
>> This adds a new KVM_CAP_SPAPR_TCE_IOMMU ioctl to associate
>> a virtual PCI bus ID (LIOBN) with an IOMMU group, which enables
>> in-kernel handling of IOMMU map/unmap.
>>
>> This adds a special case for huge pages (16MB). The reference
>> counting cannot be easily done for such pages in real mode (when
>> MMU is off) so we added a list of huge pages. It is populated in
>> virtual mode and get_page is called just once per huge page.
>> Real mode handlers check if the requested page is huge and in the list,
>> then no reference counting is done, otherwise an exit to virtual mode
>> happens. The list is released at KVM exit. At the moment the fastest
>> card available for tests uses up to 9 huge pages so walking through this
>> list is not very expensive. However this can change and we may want
>> to optimize this.
>>
>> This also adds the virt_only parameter to the KVM module
>> for debug and performance check purposes.
>>
>> Tests show that this patch increases transmission speed from 220MB/s
>> to 750..1020MB/s on 10Gb network (Chelsio CXGB3 10Gb ethernet card).
>>
>> Cc: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>
>> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
>> Signed-off-by: Paul Mackerras <paulus@xxxxxxxxx>
>> ---
>>  Documentation/virtual/kvm/api.txt   |   28 ++++
>>  arch/powerpc/include/asm/kvm_host.h |    2 +
>>  arch/powerpc/include/asm/kvm_ppc.h  |    2 +
>>  arch/powerpc/include/uapi/asm/kvm.h |    7 +
>>  arch/powerpc/kvm/book3s_64_vio.c    |  242 ++++++++++++++++++++++++++++++++++-
>>  arch/powerpc/kvm/book3s_64_vio_hv.c |  192 +++++++++++++++++++++++++++
>>  arch/powerpc/kvm/powerpc.c          |   12 ++
>>  include/uapi/linux/kvm.h            |    2 +
>>  8 files changed, 485 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index f621cd6..2039767 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -2127,6 +2127,34 @@ written, then `n_invalid' invalid entries, invalidating any previously
>>  valid entries found.
>>
>>
>> +4.79 KVM_CREATE_SPAPR_TCE_IOMMU
>> +
>> +Capability: KVM_CAP_SPAPR_TCE_IOMMU
>> +Architectures: powerpc
>> +Type: vm ioctl
>> +Parameters: struct kvm_create_spapr_tce_iommu (in)
>> +Returns: 0 on success, -1 on error
>> +
>> +This creates a link between IOMMU group and a hardware TCE (translation
>> +control entry) table. This link lets the host kernel know what IOMMU
>> +group (i.e. TCE table) to use for the LIOBN number passed with
>> +H_PUT_TCE, H_PUT_TCE_INDIRECT, H_STUFF_TCE hypercalls.
>> +
>> +/* for KVM_CAP_SPAPR_TCE_IOMMU */
>> +struct kvm_create_spapr_tce_iommu {
>> +	__u64 liobn;
>> +	__u32 iommu_id;
>
> Wouldn't it be more in keeping

pardon?

>> +	__u32 flags;
>> +};
>> +
>> +No flag is supported at the moment.
>> +
>> +When the guest issues TCE call on a liobn for which a TCE table has been
>> +registered, the kernel will handle it in real mode, updating the hardware
>> +TCE table. TCE table calls for other liobns will cause a vm exit and must
>> +be handled by userspace.
>> +
>> +
>>  5. The kvm_run structure
>>  ------------------------
>>
>> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
>> index 36ceb0d..2b70cbc 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -178,6 +178,8 @@ struct kvmppc_spapr_tce_table {
>>  	struct kvm *kvm;
>>  	u64 liobn;
>>  	u32 window_size;
>> +	bool virtmode_only;
>
> I see this is now initialized from the global parameter, but I think
> it would be better to just check the global (debug) parameter
> directly, rather than duplicating it here.

The global parameter is in kvm.ko and the struct above is in the real mode
part which cannot go to the module.

>> +	struct iommu_group *grp; /* used for IOMMU groups */
>>  	struct page *pages[0];
>>  };
>>
>> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
>> index d501246..bdfa140 100644
>> --- a/arch/powerpc/include/asm/kvm_ppc.h
>> +++ b/arch/powerpc/include/asm/kvm_ppc.h
>> @@ -139,6 +139,8 @@ extern void kvmppc_xics_free(struct kvm *kvm);
>>
>>  extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>>  		struct kvm_create_spapr_tce *args);
>> +extern long kvm_vm_ioctl_create_spapr_tce_iommu(struct kvm *kvm,
>> +		struct kvm_create_spapr_tce_iommu *args);
>>  extern struct kvmppc_spapr_tce_table *kvmppc_find_tce_table(
>>  		struct kvm_vcpu *vcpu, unsigned long liobn);
>>  extern long kvmppc_emulated_h_put_tce(struct kvmppc_spapr_tce_table *stt,
>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
>> index 681b314..b67d44b 100644
>> --- a/arch/powerpc/include/uapi/asm/kvm.h
>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
>> @@ -291,6 +291,13 @@ struct kvm_create_spapr_tce {
>>  	__u32 window_size;
>>  };
>>
>> +/* for KVM_CAP_SPAPR_TCE_IOMMU */
>> +struct kvm_create_spapr_tce_iommu {
>> +	__u64 liobn;
>> +	__u32 iommu_id;
>> +	__u32 flags;
>> +};
>> +
>>  /* for KVM_ALLOCATE_RMA */
>>  struct kvm_allocate_rma {
>>  	__u64 rma_size;
>> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
>> index 643ac1e..98cf949 100644
>> --- a/arch/powerpc/kvm/book3s_64_vio.c
>> +++ b/arch/powerpc/kvm/book3s_64_vio.c
>> @@ -27,6 +27,9 @@
>>  #include <linux/hugetlb.h>
>>  #include <linux/list.h>
>>  #include <linux/anon_inodes.h>
>> +#include <linux/pci.h>
>> +#include <linux/iommu.h>
>> +#include <linux/module.h>
>>
>>  #include <asm/tlbflush.h>
>>  #include <asm/kvm_ppc.h>
>> @@ -38,10 +41,19 @@
>>  #include <asm/kvm_host.h>
>>  #include <asm/udbg.h>
>>  #include <asm/iommu.h>
>> +#include <asm/tce.h>
>> +
>> +#define DRIVER_VERSION "0.1"
>> +#define DRIVER_AUTHOR "Paul Mackerras, IBM Corp. <paulus@xxxxxxxxxxx>"
>> +#define DRIVER_DESC "POWERPC KVM driver"
>
> Really?

What is wrong here?


--
Alexey
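
For illustration only, the huge page list described in the commit message
could be sketched in plain userspace C roughly as follows. This is not code
from the patch: struct hugepage_list, hugepage_list_add(),
hugepage_list_contains() and MAX_HUGE_PAGES are hypothetical names, the list
is a fixed array rather than a kernel list, and the place where the real code
would call get_page() is only marked by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HUGE_PAGE_SHIFT 24   /* 16MB huge pages, as in the commit message */
#define MAX_HUGE_PAGES  16   /* hypothetical capacity for this sketch */

/* Hypothetical stand-in for the per-table list of referenced huge pages. */
struct hugepage_list {
	uint64_t base[MAX_HUGE_PAGES];  /* 16MB-aligned addresses */
	size_t count;
};

/* Round an address down to the start of its 16MB huge page. */
static uint64_t hugepage_base(uint64_t addr)
{
	return addr & ~((UINT64_C(1) << HUGE_PAGE_SHIFT) - 1);
}

/*
 * "Real mode" side: lookup only, no reference counting.
 * A false result means the caller would have to exit to virtual mode.
 */
static bool hugepage_list_contains(const struct hugepage_list *l, uint64_t addr)
{
	uint64_t base = hugepage_base(addr);

	for (size_t i = 0; i < l->count; i++)
		if (l->base[i] == base)
			return true;
	return false;
}

/*
 * "Virtual mode" side: record a huge page once.
 * The real code would take its single page reference here.
 * Returns false only when the sketch's fixed array is full.
 */
static bool hugepage_list_add(struct hugepage_list *l, uint64_t addr)
{
	assert(l != NULL);

	if (hugepage_list_contains(l, addr))
		return true;            /* already recorded, no second reference */
	if (l->count == MAX_HUGE_PAGES)
		return false;
	l->base[l->count++] = hugepage_base(addr);
	return true;
}
```

With only a handful of huge pages in use (up to 9 in the tests mentioned
above), the linear walk in hugepage_list_contains() stays cheap; a larger
working set would be the point at which a hash or tree lookup might be worth
considering.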