Re: [PATCH] drm/amdgpu: Use uncached ioremap() for LoongArch

On 06.03.23 at 04:01, WANG Xuerui wrote:
On 2023/3/6 10:49, Huacai Chen wrote:
Hi, Christian,

On Mon, Mar 6, 2023 at 12:40 AM Christian König
<christian.koenig@xxxxxxx> wrote:

On 05.03.23 at 06:21, Huacai Chen wrote:
LoongArch maintains cache coherency in hardware, but its WUC attribute
(Weak-ordered UnCached, which is similar to WC) is outside the scope of
the cache coherency mechanism. This means WUC can only be used for
write-only memory regions. So use uncached ioremap() for LoongArch in the
amdgpu driver.

Well, NAK. This is leaking platform dependencies into the driver code.

Then is it acceptable to let ioremap() depend on drm_arch_can_wc_memory()?

Note: he likely means "is it acceptable to use drm_arch_can_wc_memory() to decide between ioremap() and ioremap_wc()".

Although I doubt it's acceptable to you (driver) folks either, because while drm_arch_can_wc_memory() does isolate platform details from the driver proper, it's still papering over a platform PCIe violation in the VRAM domain. Still better than having platform defines though.

Well, agreed on the PCIe violations, but drm_arch_can_wc_memory() is for a completely different use case.

drm_arch_can_wc_memory() checks whether system memory can be accessed write-combined as well (which is a PCIe extension) or needs to be accessed with caching enabled (which is a core PCIe requirement).

So that's a completely different topic, and no, using this here is not acceptable either.

The key point is that when WUC only works with write-only mappings, you *can't* use that to implement ioremap_wc().

Also making use of drm_arch_can_wc_memory might fix this fdo issue [1] on aarch64 too (where I replied earlier). It seems people simply can't stop inventing such micro-architectures sadly...

I don't think that will help for this bug. WC on iomem is known to work correctly on aarch64 and well tested. What doesn't work is using WC on system memory.

And in the case of aarch64 it's not a core issue with the platform, but rather that some hw implementations get it right and some get it wrong.

I already had an in-depth discussion with ARM folks about that, and it seems that some hw implementations think they can combine the core IP with some cheap PCIe root complex and it just magically works.

Regards,
Christian.


[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/2313


When you have a limitation that ioremap_wc() can't guarantee read/write
ordering then that's pretty clearly a platform bug and you would need to
apply this workaround to all drivers using ioremap_wc() which isn't
really feasible.


I agree that in this case perhaps all ioremap_wc() usages would have to degrade into ioremap() for correctness on such platforms. In which case amdgpu wouldn't have to be individually called out / touched anyway. Whether this is easily doable/upstreamable is another question though...
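
To illustrate the idea of degrading at the platform level rather than in each driver, here is a small userspace model. It is only a sketch and not kernel code: struct platform, its wc_keeps_read_write_ordering flag, and the model_* functions are all hypothetical names invented for this example. The point it tries to show is that the driver keeps asking for WC unconditionally, and the platform's own mapping routine quietly hands back an uncached mapping when its WC-like attribute cannot be trusted for reads.

```c
/* Userspace model only; nothing here is a real kernel interface. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The two mapping attributes discussed in the thread. */
enum map_attr { MAP_UNCACHED, MAP_WRITE_COMBINED };

/* Hypothetical platform description (assumption for this sketch). */
struct platform {
    const char *name;
    /* false models a WUC-like attribute that is only safe for write-only use */
    bool wc_keeps_read_write_ordering;
};

/*
 * Models what the platform's ioremap_wc() would hand back: real WC where
 * it is safe, otherwise a silent fallback to an uncached mapping.
 */
static enum map_attr model_ioremap_wc(const struct platform *p)
{
    assert(p != NULL);
    return p->wc_keeps_read_write_ordering ? MAP_WRITE_COMBINED
                                           : MAP_UNCACHED;
}

/*
 * Models a driver mapping VRAM: it asks for WC and contains no
 * per-architecture checks at all.
 */
static enum map_attr model_driver_map_vram(const struct platform *p)
{
    return model_ioremap_wc(p);
}
```

In this model, a platform with the flag set to false gets MAP_UNCACHED from model_driver_map_vram() without the driver changing, which is the property argued for above: no driver is individually touched.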




