On Wed, 2024-07-03 at 23:11 -0700, Christoph Hellwig wrote:
> On Thu, Jul 04, 2024 at 10:00:52AM +0800, Icenowy Zheng wrote:
> > So I here want to ask a question as an individual hacker: what's the
> > policy of linux-pci towards these non-coherent PCIe implementations?
> >
> > If the sentences of Christian is right, these implementations are just
> > out-of-spec, should them get purged out of the kernel, or at least
> > raising a warning that some HW won't work because of inconformant
> > implementation?
>
> Nothing in the PCIe specifications that mandates a programming model.
> Non-coherent DMA is extremely common in lower end devices, and despite
> all the issues that it causes well supported in Linux.
>
> What are you trying to solve?

Currently the DRM TTM subsystem (and the GPU drivers using it) assumes
coherency and fails on these non-coherent systems with cryptic error
messages (like
`[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)`)
that do not mention coherency issues at all.

My original patchset tries to solve this problem by making the TTM
subsystem aware of the coherency status (and preventing CPU-side cached
mappings when the system is non-coherent), but the TTM maintainer objected
to it and said that TTM's ignorance of non-coherent systems is
intentional.

>