https://bugzilla.kernel.org/show_bug.cgi?id=196409

Bug ID: 196409
Summary: kvm_amd nested pagetable gpu passthrough performance oddities
Product: Virtualization
Version: unspecified
Kernel Version: 4.10.8-1
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: kvm
Assignee: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
Reporter: efeu@xxxxxxxxxx
Regression: No

The hardware I was testing with:

AMD Ryzen R7 1700
Gigabyte GA-AX370-Gaming 5
different GPUs
Windows 10 x64 guest

But this bug is reproducible on the AMD FX series too.

Although the community has been discussing this bug for a while, I haven't found it reported here.
Based on our discussion here: http://www.spinics.net/lists/kvm/msg149446.html

With npt enabled, a passed-through GPU delivers much lower performance than expected.

Here are some test results the community has already collected:

First, the Heaven benchmark with ultra settings at 1920x1080:
- DirectX 11:
  - npt=0: 87.0 fps
  - npt=1: 78.4 fps (10% drop)
- DirectX 9:
  - npt=0: 100.0 fps
  - npt=1: 66.4 fps (33% drop)
- OpenGL:
  - npt=0: 82.5 fps
  - npt=1: 35.2 fps (58% drop)

Heaven benchmark again, this time with low settings at 1280x720:
- DirectX 11:
  - npt=0: 182.5 fps
  - npt=1: 140.1 fps (25% drop)
- DirectX 9:
  - npt=0: 169.2 fps
  - npt=1: 74.1 fps (56% drop)
- OpenGL:
  - npt=0: 202.8 fps
  - npt=1: 45.0 fps (78% drop)

PerformanceTest 9.0 3D benchmark:
- DirectX 9:
  - npt=0: 157 fps
  - npt=1: 13 fps (92% drop)
- DirectX 10:
  - npt=0: 220 fps
  - npt=1: 212 fps (4% drop)
- DirectX 11:
  - npt=0: 234 fps
  - npt=1: 140 fps (40% drop)
- DirectX 12:
  - npt=0: 88 fps (scored 35 because the FPS is penalized for not being able to run at 4k)
  - npt=1: 4.5 fps (scored 1, 95% drop)
- GPU Compute:
  - Mandel:
    - npt=0: ~= 2000 fps
    - npt=1: ~= 2000 fps
  - Bitonic Sort:
    - npt=0: ~= 153583696.0 elements/sec
    - npt=1: ~= 106233376.0 elements/sec (31% drop)
  - QJulia4D:
    - npt=0: ~= 1000 fps
    - npt=1: ~= 1000 fps
  - OpenCL:
    - npt=0: ~= 750 fps
    - npt=1: ~= 220 fps

Some more data from 3DMark benchmarks:
Time Spy (DirectX 12):
- Graphics test 1:
  - npt=0: 37.65 FPS
  - npt=1: 24.22 FPS (36% drop)
- Graphics test 2:
  - npt=0: 33.05 FPS
  - npt=1: 29.65 FPS (10% drop)
- CPU test:
  - npt=0: 17.35 FPS
  - npt=1: 12.03 FPS (31% drop)

Fire Strike (DirectX 11):
- Graphics test 1:
  - npt=0: 80.56 FPS
  - npt=1: 41.89 FPS (49% drop)
- Graphics test 2:
  - npt=0: 70.64 FPS
  - npt=1: 60.75 FPS (14% drop)
- Physics test:
  - npt=0: 50.14 FPS
  - npt=1: 5.78 FPS (89% drop)
- Combined test:
  - npt=0: 32.83 FPS
  - npt=1: 17.70 FPS (47% drop)

Sky Diver (DirectX 11):
- Graphics test 1:
  - npt=0: 248.81 FPS
  - npt=1: 173.63 FPS (31% drop)
- Graphics test 2:
  - npt=0: 250.49 FPS
  - npt=1: 124.84 FPS (51% drop)
- Physics test:
  - 8 threads:
    - npt=0: 140.93 FPS
    - npt=1: 119.08 FPS (15% drop)
  - 24 threads:
    - npt=0: 110.22 FPS
    - npt=1: 74.55 FPS (33% drop)
  - 48 threads:
    - npt=0: 71.56 FPS
    - npt=1: 45.93 FPS (36% drop)
  - 96 threads:
    - npt=0: 41.04 FPS
    - npt=1: 24.81 FPS (40% drop)
- Combined test:
  - npt=0: 75.65 FPS
  - npt=1: 50.45 FPS (33% drop)

I compared the performance with Xen and found no performance impact there, so the bug should be in the nested page table implementation in the kvm_amd module and not a hardware-related issue in AMD-Vi.
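
For anyone who wants to recheck the quoted drops or add their own measurements, here is a minimal sketch of the arithmetic, assuming "drop" means the relative loss of the npt=1 result against the npt=0 result. The function name percent_drop and the results table are only illustrative and not part of any benchmark tool; the number pairs are copied from the figures in this report.

```python
def percent_drop(npt_off: float, npt_on: float) -> float:
    """Relative loss of the npt=1 result versus the npt=0 result, in percent."""
    if npt_off <= 0:
        raise ValueError("the npt=0 baseline must be positive")
    return (npt_off - npt_on) / npt_off * 100.0


# (npt=0, npt=1) pairs taken from the report; the labels are informal.
results = {
    "Heaven ultra, DirectX 11": (87.0, 78.4),
    "PerformanceTest 9.0, DirectX 9": (157.0, 13.0),
    "Time Spy, Graphics test 1": (37.65, 24.22),
    "Fire Strike, Physics test": (50.14, 5.78),
}

for name, (npt_off, npt_on) in results.items():
    print(f"{name}: {percent_drop(npt_off, npt_on):.1f}% drop")
```

A result that is higher with npt=1 would come out as a negative drop, so the same helper also works for the cases where npt makes no difference or helps.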