On 2021-02-04 03:16, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 11:56:27AM +0530, Sai Prakash Ranjan wrote:
> > On 2021-02-01 23:50, Jordan Crouse wrote:
> > > On Mon, Feb 01, 2021 at 08:20:44AM -0800, Rob Clark wrote:
> > > > On Mon, Feb 1, 2021 at 3:16 AM Will Deacon <will@xxxxxxxxxx> wrote:
> > > > > On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan wrote:
> > > > > > On 2021-01-29 14:35, Will Deacon wrote:
> > > > > > > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> > > > > > > > +#define IOMMU_LLC (1 << 6)
> > > > > > >
> > > > > > > On reflection, I'm a bit worried about exposing this because I
> > > > > > > think it will introduce a mismatched virtual alias with the CPU
> > > > > > > (we don't even have a MAIR set up for this memory type). Now, we
> > > > > > > also have that issue for the PTW, but since we always use cache
> > > > > > > maintenance (i.e. the streaming API) for publishing the
> > > > > > > page-tables to a non-coherent walker, it works out. However, if
> > > > > > > somebody expects IOMMU_LLC to be coherent with a DMA API coherent
> > > > > > > allocation, then they're potentially in for a nasty surprise due
> > > > > > > to the mismatched outer-cacheability attributes.
> > > > > > >
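As an aside, the "streaming API" maintenance referred to here for the PTW is
roughly the pattern below; this is only a minimal sketch with a placeholder
device and table, not the actual io-pgtable code:

#include <linux/dma-mapping.h>

/*
 * Minimal sketch (placeholder names, not the io-pgtable code): publish a
 * CPU-written table to a non-coherent walker/device using the streaming
 * DMA API, so the relevant CPU cache lines are cleaned first.
 */
static dma_addr_t publish_table(struct device *dev, void *table, size_t size)
{
	dma_addr_t dma;

	dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, dma))
		return DMA_MAPPING_ERROR;

	/* After later CPU updates, clean again before the walker re-reads. */
	dma_sync_single_for_device(dev, dma, size, DMA_TO_DEVICE);

	return dma;
}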
> > > > > >
> > > > > > Can't we add the syscached memory type similar to what is done on
> > > > > > android?
> > > > >
> > > > > Maybe. How does the GPU driver map these things on the CPU side?
> > > >
> > > > Currently we use writecombine mappings for everything, although there
> > > > are some cases that we'd like to use cached (but have not merged
> > > > patches that would give userspace a way to flush/invalidate)
> > > >
> > >
> > > LLC/system cache doesn't have a relationship with the CPU cache. It's
> > > just a little accelerator that sits on the connection from the GPU to
> > > DDR and caches accesses. The hint that Sai is suggesting is used to
> > > mark the buffers as 'no-write-allocate' to prevent GPU write
> > > operations from being cached in the LLC, which a) isn't interesting
> > > and b) takes up cache space for read operations.
> > >
> > > It's easiest to think of the LLC as a bonus accelerator that has no
> > > cost for us to use outside of the unfortunate per-buffer hint.
> > >
> > > We do have to worry about the CPU cache w.r.t. I/O coherency (which is
> > > a different hint), and in that case we have all of the concerns that
> > > Will identified.
> >
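(For context, plumbing a per-buffer decision like this down to the map call
would look roughly like the sketch below; apart from IOMMU_LLC itself, the
helper name and the flag handling are made up for illustration.)

#include <linux/iommu.h>

/*
 * Hypothetical helper: map a buffer and, if the caller asked for it, set
 * the proposed IOMMU_LLC prot bit so that device accesses through this
 * mapping can allocate in the system cache (LLC). Everything except
 * IOMMU_LLC itself is invented for this sketch.
 */
static int example_map_buffer(struct iommu_domain *domain, unsigned long iova,
			      phys_addr_t paddr, size_t size, bool use_llc)
{
	int prot = IOMMU_READ | IOMMU_WRITE;

	if (use_llc)
		prot |= IOMMU_LLC;

	return iommu_map(domain, iova, paddr, size, prot);
}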
> > For the mismatched outer cacheability attributes which Will mentioned, I
> > was referring to [1] in the android kernel.
>
> I've lost track of the conversation here :/
>
> When the GPU has a buffer mapped with IOMMU_LLC, is the buffer also
> mapped into the CPU and with what attributes? Rob said "writecombine for
> everything" -- does that mean ioremap_wc() / MEMREMAP_WC?

Rob answered this.
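For reference, the writecombine CPU mapping being described looks roughly
like the fragment below; this is an illustrative GEM mmap sketch, not the
actual msm code:

#include <drm/drm_gem.h>
#include <linux/mm.h>

/*
 * Illustrative only: give userspace a writecombine (Normal non-cacheable)
 * view of a GEM object, so no CPU cache lines are ever allocated for it
 * and no flush/invalidate interface is needed.
 */
static int example_gem_mmap(struct drm_gem_object *obj,
			    struct vm_area_struct *vma)
{
	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));

	/* Page insertion (e.g. from the fault handler) is omitted here. */
	return 0;
}

For kernel-side mappings of such memory, ioremap_wc() or memremap() with
MEMREMAP_WC give the same write-combine attribute.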
> Finally, we need to be careful when we use the word "hint" as "allocation
> hint" has a specific meaning in the architecture, and if we only mismatch
> on those then we're actually ok. But I think IOMMU_LLC is more than just
> a hint, since it actually drives eviction policy (i.e. it enables
> writeback).
>
> Sorry for the pedantry, but I just want to make sure we're all talking
> about the same things!

Sorry for the confusion, which was probably caused by my mentioning
android: NWA (no-write-allocate) is an allocation hint, which we can
ignore for now as it has not been introduced upstream yet.
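To make the hint-versus-attribute distinction concrete, the sketch below
shows illustrative Arm MAIR encodings for Normal memory (the macro names
are invented here and are not from the posted patch):

/*
 * Illustrative MAIR attribute bytes for Normal memory (names invented):
 *
 *   0x44 - Inner/Outer Non-cacheable: nothing allocates anywhere, so
 *          there is no eviction policy in play at all.
 *   0xf4 - Inner Non-cacheable, Outer Write-Back R+W-Allocate: lines
 *          allocate in the outer/system cache and are written back on
 *          eviction, i.e. this changes behaviour, not just allocation.
 *   0xe4 - Inner Non-cacheable, Outer Write-Back Read-Allocate only:
 *          the "no-write-allocate" flavour -- still write-back, but
 *          device writes do not allocate new lines.
 */
#define EXAMPLE_MAIR_ATTR_INC_ONC	0x44
#define EXAMPLE_MAIR_ATTR_INC_OWB_RWA	0xf4
#define EXAMPLE_MAIR_ATTR_INC_OWB_RA	0xe4

Roughly, IOMMU_LLC is about moving from the first encoding to something
like the second (which is why it is more than a hint), while the android
NWA variant only differs from that in the allocation hint, i.e. the third.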
Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation