Hi Christian,

(Your email software is configured to send HTML, btw.)

On Tuesday, 30 January 2024 at 10:23 +0100, Christian König wrote:
> On 30.01.24 at 10:01, Daniel Vetter wrote:
> > On Fri, Jan 26, 2024 at 05:42:50PM +0100, Christian König wrote:
> > > [SNIP]
> > > Well, I think we should have some solution, but I'm not sure if
> > > it should be part of DMA-buf.
> > >
> > > Essentially, a DMA-buf exporter exports the buffers the way it
> > > uses them, and the importer (or the DMA-buf subsystem) is then
> > > the one who says "ok, I can use this", "I can't use this", or
> > > "I need to call extra functions to use this", or whatever.
> > >
> > > It's not the job of the exporter to provide coherency for the
> > > importer, because otherwise we would have a lot of code in the
> > > exporter which can only be tested when you have the right
> > > importer around. And I strongly think that this is a no-go for
> > > a reliable solution.
> >
> > The trouble is that if you have memory other than stuff allocated
> > by the dma-api or mapped using the dma-api, then by necessity the
> > exporter has to deal with this.
>
> Yes, I was thinking about that as well.
>
> > Which is the exact same reason we also force the exporters to
> > deal with the CPU cache flushing - your argument that it's not
> > great to replicate this everywhere holds there equally.
>
> And I'm not really happy with that either.
>
> > The other thing is that right now the exporter is the only one
> > who actually knows what kind of DMA coherency rules apply for a
> > certain piece of memory. E.g. on i915-gem, even if a buffer is
> > mapped with dma_map_sg, the underlying i915-gem buffer might be
> > non-coherent, and i915-gem makes it all work by doing the
> > appropriate amount of clflush.
>
> Yeah, exactly; that's the reason why I think that this stuff
> doesn't belong into exporters/drivers.
>
> Looking at what kind of hacks and workarounds we have in both
> amdgpu and i915, it's pretty clear that we need to improve this
> design somehow.
>
> > Similar funky things happen in other cases.
> >
> > So unless we add an interface which allows importers to figure
> > out how much flushing is needed, currently the exporter is the
> > only one who knows (because it can inspect the struct device at
> > dma_buf_attach time).
> >
> > We could flip this around, but it would be a rather serious
> > departure from the dma-buf design approach thus far.
>
> Well, clients already give the DMA direction to exporters when
> creating the mapping, and get an appropriate sg_table in return.
>
> All we need to do is get the information about what flushing is
> needed into the object returned here, so that the DMA API can work
> with it.
>
> Christoph Hellwig pretty much nailed it when he said that the
> problem with the sg_table is that it mixes input and output
> parameters of the DMA-API.
>
> I would extend that and say that we need a mapping object the
> DMA-API can work with, so that it knows what needs to be done when
> devices request that data is made coherent between them or with
> the CPU.
>
> > > That's why I think the approach of having DMA-buf callbacks is
> > > most likely the wrong thing to do.
> > >
> > > What should happen instead is that the DMA subsystem provides
> > > functionality to devices which don't support coherency through
> > > their connection, so they can say "I want to access this data,
> > > please make sure to flush the appropriate caches". But that's
> > > just a very, very rough design idea.
> > >
> > > This will become more of an issue with CXL on the horizon, I
> > > think.
> >
> > Yeah, CXL will make this all even more fun, but we are firmly
> > there already, with devices deciding per-buffer (or sometimes
> > even per-access, with Intel's MOCS stuff) what coherency mode to
> > use for a buffer.
> >
> > Also, ARM SoCs generally have both coherent and non-coherent
> > device interconnects, and I think some devices can even switch
> > which mode they use for a specific transaction with runtime
> > flags.
> >
> > CXL just extends this to PCIe devices.
> >
> > So the mess is here; how do we deal with it?
>
> I would say we start with the DMA-API, by getting away from
> sg_tables to something cleaner and state-oriented.

FYI, I am already adding a 'dma_vec' object in my IIO DMABUF
patchset, which is just a dead simple:

struct dma_vec {
	dma_addr_t addr;
	size_t len;
};

(The rationale for introducing it in the IIO DMABUF patchset was
that the scatterlist wouldn't allow me to change the transfer size.)

So I believe a new "sg_table"-like object could just be an array of
struct dma_vec plus flags.
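To make that a bit more concrete, here is a very rough sketch of
what such a mapping object could look like. To be clear, this is
nothing but an illustration: the "dma_vec_table" name, its fields
and the flag below are all made up, not an existing or proposed API:

#include <linux/types.h>	/* dma_addr_t, size_t */

/* Purely illustrative sketch. */
struct dma_vec_table {
	struct dma_vec *vecs;	/* the address/length pairs */
	unsigned int num_vecs;
	/*
	 * Output state for the DMA API rather than an input
	 * parameter: e.g. a flag saying whether this mapping is
	 * coherent for the importing device, so the importer knows
	 * what flushing (if any) it has to request.
	 */
	unsigned long flags;
};
#define DMA_VEC_TABLE_COHERENT	(1UL << 0)

That way the inputs (direction, attaching device) and the outputs
(addresses plus coherency state) wouldn't be mixed in the same
fields, which is exactly what Christoph complained about with the
sg_table.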
Cheers,
-Paul

> > My take is that the opt-in callback addition is far from great,
> > but it's in line with how we extended dma-buf the past decade
> > plus too. So unless someone's volunteering to pour some serious
> > time into re-engineering this all (including testing all the
> > different device/driver<->device/driver interactions), I think
> > there's only really one other option: to not support these cases
> > at all. And I don't really like that, because it means people
> > will hack together something even worse in their drivers.
> >
> > By adding it to dma-buf it'll stare us in our faces at least :-/
>
> Yeah, it's the path of least resistance. But with CXL on the
> horizon and more and more drivers using it, I think it's
> predictable that this will sooner or later blow up.
>
> Cheers,
> Christian.
>
> > Cheers, Sima
> >
> > > Regards,
> > > Christian.
> > >
> > > > Cheers, Sima
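P.S.: For anyone following along who hasn't looked at the importer
side recently, the mapping flow Christian and Daniel are discussing
above currently looks roughly like this (error paths trimmed, and
my_import() is of course just a made-up example function):

#include <linux/device.h>
#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/err.h>

static int my_import(struct device *dev, struct dma_buf *dmabuf)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;

	/* Attach our device; the exporter can inspect "dev" here. */
	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach))
		return PTR_ERR(attach);

	/*
	 * The DMA direction goes in, an sg_table comes out. Whether
	 * extra cache maintenance is needed on top is the exporter's
	 * problem; the importer currently has no way to even ask.
	 */
	sgt = dma_buf_map_attachment_unlocked(attach,
					      DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt)) {
		dma_buf_detach(dmabuf, attach);
		return PTR_ERR(sgt);
	}

	/* ... program the device with sgt and do the DMA ... */

	dma_buf_unmap_attachment_unlocked(attach, sgt,
					  DMA_BIDIRECTIONAL);
	dma_buf_detach(dmabuf, attach);
	return 0;
}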