On Mon, Mar 27, 2023, at 14:48, Robin Murphy wrote:
> On 2023-03-27 13:13, Arnd Bergmann wrote:
>>
>> [ HELP NEEDED: can anyone confirm that it is a correct assumption
>>   on arm that a cache-coherent device writing to a page always results
>>   in it being in a PG_dcache_clean state like on ia64, or can a device
>>   write directly into the dcache?]
>
> In AMBA at least, if a snooping write hits in a cache then the data is
> most likely going to get routed directly into that cache. If it has
> write-back write-allocate attributes it could also land in any cache
> along its normal path to RAM; it wouldn't have to go all the way.
>
> Hence all the fun we have where treating a coherent device as
> non-coherent can still be almost as broken as the other way round :)

Ok, thanks for the information. I'm still not sure whether this can
result in a situation where PG_dcache_clean is wrong, though.

Specifically, the question is whether a DMA to a coherent buffer can
end up in a dirty L1 dcache of one core, so that the dcache has to be
written back before the icache is invalidated for that page.

On ia64, this is not the case: the optimization there is to only flush
the icache after a coherent DMA into an executable user page. Arm
applies this optimization only for noncoherent DMA, not for coherent
DMA.

From your explanation it sounds like this might happen, even though
that would mean that "coherent" DMA is slightly less coherent than it
is elsewhere.

To be on the safe side, I'd have to pass a flag into
arch_dma_mark_clean() about coherency, to let the arm implementation
still require the extra dcache flush for coherent DMA, while ia64 can
ignore that flag.

      Arnd
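
For illustration only, here is a rough userspace model of what passing
a coherency flag into arch_dma_mark_clean() could mean. This is not
kernel code and not a proposed patch: struct model_page, enum
model_arch, model_dma_mark_clean() and model_sync_icache() are all
hypothetical names, and the real arch_dma_mark_clean() has a different
signature. The sketch only assumes the behaviour described above: ia64
ignores the flag and always treats the page as clean, while arm
declines to treat the page as clean after coherent DMA, so the later
icache sync still writes back the dcache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; models only the state discussed here. */
struct model_page {
	bool dcache_clean;   /* models PG_dcache_clean */
	int dcache_flushes;  /* number of modelled dcache write-backs */
	int icache_flushes;  /* number of modelled icache invalidations */
};

/* Hypothetical architecture selector, used only by this model. */
enum model_arch { MODEL_ARCH_IA64, MODEL_ARCH_ARM };

/*
 * Hypothetical variant of arch_dma_mark_clean() that is told whether
 * the DMA that just completed was cache-coherent.
 */
static void model_dma_mark_clean(enum model_arch arch,
				 struct model_page *page, bool coherent)
{
	assert(page != NULL);

	if (arch == MODEL_ARCH_ARM && coherent) {
		/*
		 * Assumption from the thread: a snooping write may have
		 * landed in a CPU cache, so the page cannot be claimed
		 * clean. The model conservatively clears the flag.
		 */
		page->dcache_clean = false;
		return;
	}

	/* ia64 ignores the flag; arm noncoherent DMA marks the page clean. */
	page->dcache_clean = true;
}

/* Models making an executable user page visible to instruction fetch. */
static void model_sync_icache(struct model_page *page)
{
	assert(page != NULL);

	if (!page->dcache_clean) {
		/* Write back the dcache before touching the icache. */
		page->dcache_flushes++;
		page->dcache_clean = true;
	}
	page->icache_flushes++;
}
```

In this model, a coherent DMA on arm followed by model_sync_icache()
costs one dcache write-back plus one icache invalidation, whereas on
ia64 the same sequence costs only the icache invalidation.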