On Mon, May 19, 2014 at 11:22:11AM +0200, Lucas Stach wrote: > Am Montag, den 19.05.2014, 11:02 +0200 schrieb Thierry Reding: > > On Mon, May 19, 2014 at 04:10:58PM +0900, Alexandre Courbot wrote: > > > Some architectures (e.g. ARM) need the CPU buffers to be explicitely > > > flushed for a memory write to take effect. Not doing so results in > > > synchronization issues, especially after writing to BOs. > > > > It seems to me that the above is generally true for all architectures, > > not just ARM. > > > No, on PCI coherent arches, like x86 and some PowerPCs, the GPU will > snoop the CPU caches and therefore an explicit cache flush is not > required. I was criticizing the wording in the commit message. Perhaps it could be enhanced with what you just said. > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c > > [...] > > > index 0886f47e5244..b9c9729c5733 100644 > > > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > > > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > > > @@ -437,8 +437,10 @@ nouveau_bo_wr16(struct nouveau_bo *nvbo, unsigned index, u16 val) > > > mem = &mem[index]; > > > if (is_iomem) > > > iowrite16_native(val, (void __force __iomem *)mem); > > > - else > > > + else { > > > *mem = val; > > > + nv_cpu_cache_flush_area(mem, 2); > > > + } > > > } > > > > > > u32 > > > @@ -461,8 +463,10 @@ nouveau_bo_wr32(struct nouveau_bo *nvbo, unsigned index, u32 val) > > > mem = &mem[index]; > > > if (is_iomem) > > > iowrite32_native(val, (void __force __iomem *)mem); > > > - else > > > + else { > > > *mem = val; > > > + nv_cpu_cache_flush_area(mem, 4); > > > + } > > > > This looks rather like a sledgehammer to me. Effectively this turns nvbo > > into an uncached buffer. With additional overhead of constantly flushing > > caches. Wouldn't it make more sense to locate the places where these are > > called and flush the cache after all the writes have completed? > > > I don't think the explicit flushing for those things makes sense. I > think it is a lot more effective to just map the BOs write-combined on > PCI non-coherent arches. This way any writes will be buffered. Reads > will be slow, but I don't think nouveau is reading back a lot from those > buffers. > Using the write-combining buffer doesn't need any additional > synchronization as it will get flushed on pushbuf kickoff anyways. Sounds good to me. Thierry
Attachment:
pgpdRhZWOMWhP.pgp
Description: PGP signature