On 2019-01-21 7:20 p.m., Ard Biesheuvel wrote:
> On Mon, 21 Jan 2019 at 19:04, Michel Dänzer <michel@xxxxxxxxxxx> wrote:
>>
>> On 2019-01-21 6:59 p.m., Ard Biesheuvel wrote:
>>> On Mon, 21 Jan 2019 at 18:55, Michel Dänzer <michel@xxxxxxxxxxx> wrote:
>>>>
>>>> On 2019-01-21 5:30 p.m., Ard Biesheuvel wrote:
>>>>> On Mon, 21 Jan 2019 at 17:22, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>>>>>
>>>>>> Until that happens we should just change the driver ifdefs to default
>>>>>> the hacks to off and only enable them on setups where we 100%
>>>>>> positively know that they actually work. And document that fact
>>>>>> in big fat comments.
>>>>>
>>>>> Well, as I mentioned in my commit log as well, if we default to off
>>>>> unless CONFIG_X86, we may break working setups on MIPS and Power where
>>>>> the device is in fact non-cache coherent, and relies on this
>>>>> 'optimization' to get things working.
>>>>
>>>> FWIW, the amdgpu driver doesn't rely on non-snooped transfers for
>>>> correct basic operation (the scenario Christian brought up is a very
>>>> specialized use-case), so that shouldn't be an issue.
>>>>
>>>
>>> The point is that this is only true for x86.
>>>
>>> On other architectures, the use of non-cached mappings on the CPU side
>>> means that you /do/ rely on non-snooped transfers, since if those
>>> transfers turn out not to snoop inadvertently, the accesses are
>>> incoherent with the CPU's view of memory.
>>
>> The driver generally only uses non-cached mappings if
>> drm_arch/device_can_wc_memory returns true.
>>
>
> Indeed. And so we should take care to only return 'true' from that
> function if it is guaranteed that non-cached CPU mappings are coherent
> with the mappings used by the GPU, either because that is always the
> case (like on x86), or because we know that the platform in question
> implements NoSnoop correctly throughout the interconnect.
>
> What seems to be complicating matters is that in some cases, the
> device is non-cache coherent to begin with, so regardless of whether
> the NoSnoop attribute is used or not, those accesses will not snoop in
> the caches and be coherent with the non-cached mappings used by the
> CPU. So if we restrict this optimization [on non-X86] to platforms
> that are known to implement NoSnoop correctly, we may break platforms
> that are implicitly NoSnoop all the time.

Since the driver generally doesn't rely on non-snooped accesses for
correctness, that couldn't "break" anything that hasn't always been broken.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer