Re: AMD Ryzen KVM/NPT/IOMMU issue

On Wed, Jun 28, 2017 at 7:34 PM, Nick Sarnie <commendsarnex@xxxxxxxxx> wrote:
> Hi Suravee,
>
> Thanks a lot for helping. Torcs does not appear graphically demanding
> on modern hardware, so this issue may not be easily noticeable. I was
> able to easily reproduce the problem using the Unigine Heaven
> benchmark, but I'm sure anything moderately graphically demanding
> would show a performance loss with NPT enabled. As an example, when I
> tested this with Fedora on my RX480, I got around 30-35 FPS with NPT
> on and around 55-60 with NPT off.
>
> Let me know if you need any more information or have any questions.
>
> (no problem John, thanks a lot for taking interest in this)
>
> Thanks again,
> Sarnex

Hi

I don't think the FPS drop is proportional to how graphically demanding the
workload is. If anything, at first sight it seems that the less demanding a
workload is, the bigger the FPS impact, though as the numbers below suggest,
this is not always the case.

Unfortunately I haven't been able to find a pattern in what causes the biggest
FPS impact, except that the relative drop increases with higher FPS values.
Other than that, it seems very specific to the workload/benchmark used.

Here's some data I've collected to help with the investigation. The system is a
Ryzen 1700 (no overclock, 3 GHz) with a GTX 1070 and a Windows 10 guest.

I've used Unigine Heaven and Passmark's PerformanceTest 9.0.
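For anyone trying to reproduce the npt=0/npt=1 cases: I'm assuming the usual way
of toggling AMD nested paging, which is the kvm_amd module's npt parameter (the
exact steps below aren't from my test notes, just the standard procedure; the
guest must be shut down before reloading the module):

```shell
# Check the current NPT setting of the kvm_amd module
cat /sys/module/kvm_amd/parameters/npt

# Reload kvm_amd with nested paging disabled (npt=0)
sudo modprobe -r kvm_amd
sudo modprobe kvm_amd npt=0

# Or make the setting persistent across reboots
echo "options kvm_amd npt=0" | sudo tee /etc/modprobe.d/kvm_amd.conf
```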

First, the Heaven benchmark with ultra settings at 1920x1080:

- DirectX 11:
  - npt=0: 87.0 fps
  - npt=1: 78.4 fps (10% drop)
- DirectX 9:
  - npt=0: 100.0 fps
  - npt=1: 66.4 fps (33% drop)
- OpenGL:
  - npt=0: 82.5 fps
  - npt=1: 35.2 fps (58% drop)

Heaven benchmark again, this time with low settings at 1280x720:

- DirectX 11:
  - npt=0: 182.5 fps
  - npt=1: 140.1 fps (25% drop)
- DirectX 9:
  - npt=0: 169.2 fps
  - npt=1: 74.1 fps (56% drop)
- OpenGL:
  - npt=0: 202.8 fps
  - npt=1: 45.0 fps (78% drop)

PerformanceTest 9.0 3D benchmark:

- DirectX 9:
  - npt=0: 157 fps
  - npt=1: 13 fps (92% drop)
- DirectX 10:
  - npt=0: 220 fps
  - npt=1: 212 fps (4% drop)
- DirectX 11:
  - npt=0: 234 fps
  - npt=1: 140 fps (40% drop)
- DirectX 12:
  - npt=0: 88 fps (scored 35 because of the FPS penalty for not being able to run at 4k)
  - npt=1: 4.5 fps (scored 1, 95% drop)
- GPU Compute:
  - Mandel:
    - npt=0: ~= 2000 fps
    - npt=1: ~= 2000 fps
  - Bitonic Sort:
    - npt=0: ~= 153583696.0 elements/sec
    - npt=1: ~= 106233376.0 elements/sec (31% drop)
  - QJulia4D:
    - npt=0: ~= 1000 fps
    - npt=1: ~= 1000 fps
  - OpenCL:
    - npt=0: ~= 750 fps
    - npt=1: ~= 220 fps (~71% drop)

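To be clear about how I'm quoting the drops: each percentage is just the
relative drop (npt0 - npt1) / npt0, rounded. For example, the Heaven DirectX 11
ultra run:

```shell
# Relative FPS drop for Heaven DX11 ultra: npt=0 at 87.0 fps, npt=1 at 78.4 fps
awk 'BEGIN { printf "%.0f%%\n", (87.0 - 78.4) / 87.0 * 100 }'
# prints: 10%
```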
As you can see, in some cases there's only about a 5% drop (which could be
within the margin of error), while in others the drop is as high as 95%. Some
points of interest:

- Passmark DirectX 9 is not graphically demanding (it runs at 1024x768; the
  GTX 1070 doesn't break a sweat), yet it suffers a 92% drop in FPS.
- Unigine DirectX 11 on ultra is graphically demanding and suffers less than a
  10% drop in FPS.
- Passmark DirectX 12 is graphically demanding and suffers a 95% drop in FPS.
- The bitonic sort is not a graphical benchmark; it shows its result (the
  average number of sorted elements/sec) in a console window, yet it suffers a
  31% drop in performance.

I think it would take someone with experience in GPU programming, and with
knowledge of what each benchmark does, to find a pattern in these numbers.

Thiago


