On Mon, 21 Jan 2019 at 19:24, Michel Dänzer <michel@xxxxxxxxxxx> wrote: > > On 2019-01-21 7:20 p.m., Ard Biesheuvel wrote: > > On Mon, 21 Jan 2019 at 19:04, Michel Dänzer <michel@xxxxxxxxxxx> wrote: > >> > >> On 2019-01-21 6:59 p.m., Ard Biesheuvel wrote: > >>> On Mon, 21 Jan 2019 at 18:55, Michel Dänzer <michel@xxxxxxxxxxx> wrote: > >>>> > >>>> On 2019-01-21 5:30 p.m., Ard Biesheuvel wrote: > >>>>> On Mon, 21 Jan 2019 at 17:22, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: > >>>>> > >>>>>> Until that happens we should just change the driver ifdefs to default > >>>>>> the hacks to off and only enable them on setups where we 100% > >>>>>> positively know that they actually work. And document that fact > >>>>>> in big fat comments. > >>>>> > >>>>> Well, as I mentioned in my commit log as well, if we default to off > >>>>> unless CONFIG_X86, we may break working setups on MIPS and Power where > >>>>> the device is in fact non-cache coherent, and relies on this > >>>>> 'optimization' to get things working. > >>>> > >>>> FWIW, the amdgpu driver doesn't rely on non-snooped transfers for > >>>> correct basic operation (the scenario Christian brought up is a very > >>>> specialized use-case), so that shouldn't be an issue. > >>>> > >>> > >>> The point is that this is only true for x86. > >>> > >>> On other architectures, the use of non-cached mappings on the CPU side > >>> means that you /do/ rely on non-snooped transfers, since if those > >>> transfers turn out not to snoop inadvertently, the accesses are > >>> incoherent with the CPU's view of memory. > >> > >> The driver generally only uses non-cached mappings if > >> drm_arch/device_can_wc_memory returns true. > >> > > > > Indeed. And so we should take care to only return 'true' from that > > function if it is guaranteed that non-cached CPU mappings are coherent > > with the mappings used by the GPU, either because that is always the > > case (like on x86), or because we know that the platform in question > > implements NoSnoop correctly throughout the interconnect. > > > > What seems to be complicating matters is that in some cases, the > > device is non-cache coherent to begin with, so regardless of whether > > the NoSnoop attribute is used or not, those accesses will not snoop in > > the caches and be coherent with the non-cached mappings used by the > > CPU. So if we restrict this optimization [on non-X86] to platforms > > that are known to implement NoSnoop correctly, we may break platforms > > that are implicitly NoSnoop all the time. > > Since the driver generally doesn't rely on non-snooped accesses for > correctness, that couldn't "break" anything that hasn't always been broken. > Again, that is only true on x86. On other architectures, DMA writes from the device may allocate in the caches, and be invisible to the CPU when it uses non-cached mappings. _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx