On 2017-10-25 08:31, Alex Williamson wrote:
> On Wed, 25 Oct 2017 07:16:46 +1100
> geoff--- via iommu <iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>> I have isolated it to a single change, although I do not completely
>> understand what other implications it might have.
>>
>> By just changing the line in `init_vmcb` that reads:
>>
>>     save->g_pat = svm->vcpu.arch.pat;
>>
>> to:
>>
>>     save->g_pat = 0x0606060606060606;
>>
>> this enables write-back caching and performance jumps through the
>> roof. This needs someone with more experience to write a proper patch
>> that addresses it in a smarter way rather than just hard-coding the
>> value.
>>
>> This patch looks like an attempt to fix the same issue, but it yields
>> no detectable performance gains:
>>
>> https://patchwork.kernel.org/patch/6748441/
>>
>> Any takers?
> IOMMU is not the right list for such a change. I'm dubious this is
> correct since you're basically going against the comment immediately
> preceding it in the code, but perhaps it's a hint in the right
> direction.
>
> Thanks,
> Alex
As am I, which is why it needs someone with more experience to figure
out why this change has had such a huge impact. I have been testing
everything since I made it, and everything I throw at it runs at
near-native performance.
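
For anyone wanting to try this, here is roughly where the change sits.
A minimal sketch: the surrounding init_vmcb() lines in
arch/x86/kvm/svm.c are paraphrased from memory, and only the g_pat
assignment is the change itself. Each byte of g_pat is one PAT entry,
and 0x06 is the x86 encoding for Write-Back, so this forces WB for
every guest memory type:

    /* arch/x86/kvm/svm.c, inside init_vmcb(); context paraphrased,
     * only the g_pat line is the change under discussion. */
    if (npt_enabled) {
            /* Set up the VMCB for Nested Paging. */
            control->nested_ctl = 1;

            /* was: save->g_pat = svm->vcpu.arch.pat;
             *
             * Each byte is one PAT entry and 0x06 is Write-Back,
             * so this forces WB for all guest memory types.
             * Hard-coded purely for testing. */
            save->g_pat = 0x0606060606060606;

            save->cr3 = 0;
            save->cr4 = 0;
    }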
I will post my findings to the KVM mailing list, as it is clearly a KVM
issue with SVM. Perhaps someone there can write a patch to fix this, or
at the very least add a workaround/quirk module parameter.
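
Something along these lines is what I mean by a quirk parameter; an
untested sketch, and the name npt_wb_pat is invented purely for
illustration:

    /* Hypothetical kvm_amd module parameter (name made up here):
     * booting with kvm_amd.npt_wb_pat=1 would apply the write-back
     * guest PAT workaround described above. */
    static bool npt_wb_pat;
    module_param(npt_wb_pat, bool, 0444);
    MODULE_PARM_DESC(npt_wb_pat,
            "Force a write-back guest PAT when NPT is enabled");

    /* ...and in init_vmcb(), under the npt_enabled branch: */
    save->g_pat = npt_wb_pat ? 0x0606060606060606ULL
                             : svm->vcpu.arch.pat;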
On 2017-10-25 06:08, geoff@xxxxxxxxxxxxxxx wrote:
> I have identified the issue! With NPT enabled I am now getting near
> bare-metal performance with PCI pass-through. The issue was with some
> stubs that have not been properly implemented. I will clean my code up
> and submit a patch shortly.
>
> This is a 10-year-old bug that has only become evident with the recent
> ability to perform PCI pass-through with dedicated graphics cards. I
> would expect this to improve performance across most workloads that
> use AMD NPT.
>
> Here are some benchmarks to show what I am getting in my dev
> environment:
>
> https://www.3dmark.com/3dm/22878932
> https://www.3dmark.com/3dm/22879024
>
> -Geoff
>
>
> On 2017-10-24 16:15, geoff@xxxxxxxxxxxxxxx wrote:
>> Further to this, I have verified that the IOMMU is working fine;
>> traces and additional printks added to the kernel module were used to
>> check. All accesses are successful and hit the correct addresses.
>>
>> However, profiling under Windows shows there might be an issue with
>> IRQs not reaching the guest. When FluidMark is running at 5 fps I
>> still see excellent system responsiveness, with the CPU 90% idle and
>> the GPU load at 6%.
>>
>> When switching PhysX to CPU mode the GPU enters its low-power state,
>> indicating that the card is no longer in use. This would seem to
>> confirm that, in GPU mode, the card is indeed being driven by the
>> PhysX API correctly.
>>
>> My assumption now is that the IRQs from the video card are getting
>> lost.
>>
>> I could be completely off base here, but at this point it seems like
>> the best avenue to pursue unless someone cares to comment.
>>
>> -Geoff
>>
>>
>> On 2017-10-24 10:49, geoff@xxxxxxxxxxxxxxx wrote:
>>> Hi,
>>>
>>> I realize this is an older thread, but I have spent much of today
>>> trying to diagnose the problem.
>>>
>>> I have discovered how to reliably reproduce it with very little
>>> effort. Reproducing the issue has been hit and miss for people
>>> because it primarily affects games/programs that make use of nVidia
>>> PhysX. My understanding of NPT's inner workings is quite primitive,
>>> but I have still spent much of my time trying to diagnose the fault
>>> and identify the cause.
>>>
>>> Using the free program FluidMark[1] it is possible to reproduce the
>>> issue: on a GTX 1080 Ti the rendering rate drops to around 4 fps
>>> with NPT turned on, but with it turned off the render rate is in
>>> excess of 60 fps.
>>>
>>> I have produced traces with and without NPT enabled during these
>>> tests, which I can provide if it will help. So far I have been
>>> digging through how NPT works and trying to glean as much
>>> information as I can from the source and the AMD specifications, but
>>> much of this, and how the MMU works, is very new to me, so progress
>>> is slow.
>>>
>>> If anyone else has looked into this and has more information to
>>> share, I would be very interested.
>>>
>>> Kind Regards,
>>> Geoffrey McRae
>>> HostFission
>>> https://hostfission.com
>>>
>>>
>>> [1]:
>>> http://www.geeks3d.com/20130308/fluidmark-1-5-1-physx-benchmark-fluid-sph-simulation-opengl-download/