Re: [PATCH] drm: return false in drm_arch_can_wc_memory() for ARM

On Fri, Dec 21, 2018 at 9:16 AM Liviu Dudau <Liviu.Dudau@xxxxxxx> wrote:
>
> On Thu, Dec 20, 2018 at 04:36:19PM +0100, Daniel Vetter wrote:
> > On Thu, Dec 20, 2018 at 09:56:57AM -0500, Alex Deucher wrote:
> > > I'm not familiar enough with ARM to know if write combining
> > > is actually an architectural limitation or if it's an issue
> > > with the PCIe IPs used on various platforms, but so far
> > > everyone that has tried to run radeon hardware on
> > > ARM has had to disable it.  So let's just make it official.
> >
> > wc on arm is Really Complicated (tm) afaiui. There's issues with aliasing
> > mappings and stuff, so you need to allocate your wc memory from special
> > pools. So probably best to just disable it until we figure this out.
>
> I believe both of you are conflating different issues under the wrong
> name. Write combining happens all the time with Arm, the ARMv8
> architecture is a weakly-ordered model of memory so hardware is allowed
> to re-order or combine memory accesses as it sees fit.
>
> A while ago I did run an AMD GPU card on my Juno dev board and it worked
> (for a very limited definition of worked, I've only validated the fact
> that I could get an fbcon and could run un-accelerated X11). So I would
> be interested if Alex could share some of the scenarios where people are
> seeing failures.

Here's an example:
https://bugs.freedesktop.org/show_bug.cgi?id=108625
But there are probably 5 or 6 other cases where people have emailed me
or our team directly with issues on ARM resolved by disabling WC.
Generally the driver seems to load ok, but then hangs as soon as you
try and use acceleration from userspace or we end up with page
flipping timeouts.  I'm not really sure what the underlying issue is.
Michel suggested that ARM may have a cacheable kernel mapping of all
"normal" system memory, and that having both that mapping and another
non-cacheable mapping of the same page can result in bad behaviour.

>
> As for aliasing, yeah, having multiple aliases to the same piece of
> memory is a bad thing. The problem arises when devices on the PCI bus
> have memory allocated as device memory (which on Arm is non-cacheable
> and non-reorderable), but the PCI bus effectively acts as a write-combiner
> which changes the order of transactions. Therefore, for devices that
> have local memory associated with them (i.e. more than just register
> accesses) one should allocate memory in the first place that is
> Device-GRE (gathering, reordering and early-access). Otherwise, problems
> will surface that are not visible on x86 as that is a strongly ordered
> architecture.

PCI framebuffer BARs are mapped on the CPU with WC.  We also use
uncached WC mappings for system memory in cases where it's not likely
we will be doing any CPU reads.  When accessing system memory, the GPU
can either do a CPU cache snooped transaction or a non-snooped
transaction.  The non-snooped transaction has lower latency and better
throughput since it doesn't have to snoop the CPU cache.
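
To make the trade-off concrete, here is a rough userspace sketch of the
policy described above.  It is not driver code: every name in it
(cpu_mapping, gpu_access, choose_policy) is made up for illustration, and
pairing WC CPU mappings with non-snooped GPU access is my simplifying
assumption for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: none of these are DRM or amdgpu identifiers. */
enum cpu_mapping { CPU_MAP_CACHED, CPU_MAP_WC };
enum gpu_access  { GPU_SNOOPED, GPU_NONSNOOPED };

struct buffer_policy {
	enum cpu_mapping cpu;	/* how the CPU maps the system memory pages */
	enum gpu_access gpu;	/* what kind of transaction the GPU issues */
};

/*
 * Pick a policy for a system memory buffer.
 * cpu_reads_likely: the CPU is expected to read the buffer back.
 * arch_can_wc:      the platform is believed to handle WC mappings.
 */
static struct buffer_policy choose_policy(bool cpu_reads_likely, bool arch_can_wc)
{
	/* Safe default: cached CPU mapping, GPU snoops the CPU cache. */
	struct buffer_policy p = { CPU_MAP_CACHED, GPU_SNOOPED };

	if (!cpu_reads_likely && arch_can_wc) {
		/* No CPU cache lines to snoop, so take the faster path. */
		p.cpu = CPU_MAP_WC;
		p.gpu = GPU_NONSNOOPED;
	}

	/* In this sketch WC and non-snooped always go together. */
	assert((p.cpu == CPU_MAP_WC) == (p.gpu == GPU_NONSNOOPED));
	return p;
}
```

With arch_can_wc false, every buffer falls back to the cached and snooped
path, which is effectively what the patch does for ARM.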

>
> >
> > > Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> >
> > Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
>
> Given that this API is only used by AMD I'm OK for now with the change,
> but I think in general it is misleading and we should work towards
> fixing radeon and amd drivers.

Alternatively, we could just disable WC in the amdgpu driver on ARM.
I'm not sure to what extent other drivers are using WC in general or
have been tested on ARM.
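
A driver-local version could look roughly like the sketch below.  This is
standalone userspace C, not a patch: BO_FLAG_CPU_WC, BO_FLAG_OTHER,
driver_can_wc() and sanitize_bo_flags() are hypothetical names, and the
compiler macros __arm__/__aarch64__ stand in for CONFIG_ARM/CONFIG_ARM64
so that it builds outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical buffer flags, not the real amdgpu flag names or values. */
#define BO_FLAG_CPU_WC	(1u << 0)	/* request a WC CPU mapping */
#define BO_FLAG_OTHER	(1u << 1)	/* any unrelated flag */

/* Driver-local check, leaving the generic DRM helper untouched. */
static bool driver_can_wc(void)
{
#if defined(__arm__) || defined(__aarch64__)
	return false;
#else
	return true;
#endif
}

/* Drop the WC request where the driver does not trust WC mappings. */
static uint32_t sanitize_bo_flags(uint32_t flags)
{
	if (!driver_can_wc())
		flags &= ~BO_FLAG_CPU_WC;

	/* WC may only survive on platforms where the driver allows it. */
	assert(driver_can_wc() || !(flags & BO_FLAG_CPU_WC));
	return flags;
}
```

The difference from the patch is only where the decision lives: other
drivers calling the DRM helper would keep their current behaviour.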

Alex

>
> Best regards,
> Liviu
>
> >
> > > ---
> > >  include/drm/drm_cache.h | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
> > > index bfe1639df02d..691b4c4b0587 100644
> > > --- a/include/drm/drm_cache.h
> > > +++ b/include/drm/drm_cache.h
> > > @@ -47,6 +47,8 @@ static inline bool drm_arch_can_wc_memory(void)
> > >     return false;
> > >  #elif defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON3)
> > >     return false;
> > > +#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > > +   return false;
> > >  #else
> > >     return true;
> > >  #endif
> > > --
> > > 2.13.6
> > >
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
> --
> ====================
> | I would like to |
> | fix the world,  |
> | but they're not |
> | giving me the   |
>  \ source code!  /
>   ---------------
>     ¯\_(ツ)_/¯