[PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS

daniel@xxxxxxxx (Daniel Vetter) · Fri, 30 Jun 2017 14:39:04 +0200

On Fri, Jun 30, 2017 at 08:47:27AM +0200, Christian KÃ¶nig wrote:
> Am 30.06.2017 um 04:24 schrieb Michel DÃ¤nzer:
> > On 29/06/17 07:05 PM, Daniel Vetter wrote:
> > > On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel DÃ¤nzer wrote:
> > > > On 29/06/17 05:23 PM, Christian KÃ¶nig wrote:
> > > > > Am 29.06.2017 um 04:35 schrieb Michel DÃ¤nzer:
> > > > > > On 29/06/17 08:26 AM, John Brooks wrote:
> > > > > > > On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian KÃ¶nig wrote:
> > > > > > > > > Instead of the flag being set in stone at BO creation, set the flag
> > > > > > > > > when a
> > > > > > > > > page fault occurs so that it goes somewhere CPU-visible, and clear
> > > > > > > > > it when
> > > > > > > > > the BO is requested by the GPU.
> > > > > > > > > 
> > > > > > > > > However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> > > > > > > > > invisible VRAM, where they may promptly generate another page
> > > > > > > > > fault. When
> > > > > > > > > BOs are constantly moved back and forth like this, it is highly
> > > > > > > > > detrimental
> > > > > > > > > to performance. Only clear the flag on CS if:
> > > > > > > > > 
> > > > > > > > > - The BO wasn't page faulted for a certain amount of time
> > > > > > > > > (currently 10
> > > > > > > > >     seconds), and
> > > > > > > > > - its last page fault didn't occur too soon (currently 500ms) after
> > > > > > > > > its
> > > > > > > > >     last CS request, or vice versa.
> > > > > > > > > 
> > > > > > > > > Setting the flag in amdgpu_fault_reserve_notify() also means that
> > > > > > > > > we can
> > > > > > > > > remove the loop to restrict lpfn to the end of visible VRAM, because
> > > > > > > > > amdgpu_ttm_placement_init() will do it for us.
> > > > > > > > I'm fine with the general approach, but I'm still absolutely not
> > > > > > > > keen about
> > > > > > > > clearing the flag when userspace has originally specified it.
> > > > > > Is there any specific concern you have about that?
> > > > > Yeah, quite a bunch actually. We want to use this flag for P2P buffer
> > > > > sharing in the future as well and I don't intent to add another one like
> > > > > CPU_ACCESS_REALLY_REQUIRED or something like this.
> > > > Won't a BO need to be pinned while it's being shared with another device?
> > > That's an artifact of the current kernel implementation, I think we could
> > > do better (but for current use-cases where we share a bunch of scanouts
> > > and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never
> > > changing.
> > Surely there will need to be some kind of transaction though to let the
> > driver know when a BO starts/stops being shared with another device?
> > Either via the existing dma-buf callbacks, or something similar. We
> > can't rely on userspace setting a "CPU access" flag to make sure a BO
> > can be shared with other devices?

Well I just jumped into the middle of this that it's not entirely out of
the question as an idea, but yeah we'd need to rework the dma-buf stuff
with probably a callback to evict mappings/stall for outstanding rendering
or something like that.

> Well, the flag was never intended to be used by userspace.
> 
> See the history was more like we need something in the kernel to place the
> BO in CPU accessible VRAM.
> 
> Then the closed source UMD came along and said hey we have the concept of
> two different heaps for visible and invisible VRAM, how does that maps to
> amdgpu?
> 
> I unfortunately was to tired to push back hard enough on this....

Ehrm, are you saying you have uapi for the closed source stack only?

I can help with the push back on this with a revert, no problem :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch