On 30.06.2017 at 14:39, Daniel Vetter wrote:
> On Fri, Jun 30, 2017 at 08:47:27AM +0200, Christian König wrote:
>> On 30.06.2017 at 04:24, Michel Dänzer wrote:
>>> On 29/06/17 07:05 PM, Daniel Vetter wrote:
>>>> On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
>>>>> On 29/06/17 05:23 PM, Christian König wrote:
>>>>>> On 29.06.2017 at 04:35, Michel Dänzer wrote:
>>>>>>> On 29/06/17 08:26 AM, John Brooks wrote:
>>>>>>>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>>>>>>>>> Instead of the flag being set in stone at BO creation, set the
>>>>>>>>>> flag when a page fault occurs so that it goes somewhere
>>>>>>>>>> CPU-visible, and clear it when the BO is requested by the GPU.
>>>>>>>>>>
>>>>>>>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in
>>>>>>>>>> GTT to invisible VRAM, where they may promptly generate another
>>>>>>>>>> page fault. When BOs are constantly moved back and forth like
>>>>>>>>>> this, it is highly detrimental to performance. Only clear the
>>>>>>>>>> flag on CS if:
>>>>>>>>>>
>>>>>>>>>> - The BO wasn't page faulted for a certain amount of time
>>>>>>>>>>   (currently 10 seconds), and
>>>>>>>>>> - its last page fault didn't occur too soon (currently 500ms)
>>>>>>>>>>   after its last CS request, or vice versa.
>>>>>>>>>>
>>>>>>>>>> Setting the flag in amdgpu_fault_reserve_notify() also means
>>>>>>>>>> that we can remove the loop to restrict lpfn to the end of
>>>>>>>>>> visible VRAM, because amdgpu_ttm_placement_init() will do it
>>>>>>>>>> for us.
>>>>>>>>> I'm fine with the general approach, but I'm still absolutely not
>>>>>>>>> keen about clearing the flag when userspace has originally
>>>>>>>>> specified it.
>>>>>>> Is there any specific concern you have about that?
>>>>>> Yeah, quite a bunch actually. We want to use this flag for P2P
>>>>>> buffer sharing in the future as well, and I don't intend to add
>>>>>> another one like CPU_ACCESS_REALLY_REQUIRED or something like that.
>>>>> Won't a BO need to be pinned while it's being shared with another
>>>>> device?
>>>> That's an artifact of the current kernel implementation; I think we
>>>> could do better (though for current use-cases, where we share a bunch
>>>> of scanouts and maybe a few pixmaps, it's pointless). I wouldn't bet
>>>> uapi on this never changing.
>>> Surely there will need to be some kind of transaction though, to let
>>> the driver know when a BO starts/stops being shared with another
>>> device? Either via the existing dma-buf callbacks, or something
>>> similar. We can't rely on userspace setting a "CPU access" flag to
>>> make sure a BO can be shared with other devices.
> Well, I just jumped into the middle of this, but it's not entirely out
> of the question as an idea. We'd need to rework the dma-buf stuff,
> though, probably with a callback to evict mappings/stall for outstanding
> rendering or something like that.
>
>> Well, the flag was never intended to be used by userspace.
>>
>> See, the history was more like: we needed something in the kernel to
>> place the BO in CPU-accessible VRAM.
>>
>> Then the closed-source UMD came along and said: hey, we have the
>> concept of two different heaps for visible and invisible VRAM, how does
>> that map to amdgpu?
>>
>> I unfortunately was too tired to push back hard enough on this...
> Ehrm, are you saying you have uapi for the closed source stack only?

No, Mesa is using that flag as well. What I'm saying is that we have a
flag which became uapi because I was too lazy to distinguish between uapi
and kernel-internal flags.

> I can help with the push back on this with a revert, no problem :-)

That would break Mesa and is not an option (unfortunately :).

Christian.
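[Aside: the flag-clearing heuristic described in the quoted patch (don't clear CPU_ACCESS_REQUIRED unless the BO has been fault-free for ~10 seconds and its last fault and last CS request weren't within ~500 ms of each other) could be sketched roughly like this. The struct, function, and constant names below are made up for illustration and are not the actual amdgpu code.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical thresholds matching the patch description. */
#define FAULT_IDLE_MS   10000  /* BO must be fault-free this long */
#define FAULT_CS_GAP_MS   500  /* min gap between last fault and last CS */

/* Hypothetical per-BO bookkeeping; not the real amdgpu_bo fields. */
struct bo_fault_stats {
    uint64_t last_fault_ms;  /* timestamp of last CPU page fault */
    uint64_t last_cs_ms;     /* timestamp of last GPU CS request */
};

static uint64_t abs_diff(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}

/*
 * Decide on CS submission whether clearing CPU_ACCESS_REQUIRED is safe,
 * i.e. unlikely to cause a visible/invisible VRAM move ping-pong.
 */
static bool may_clear_cpu_access(const struct bo_fault_stats *s,
                                 uint64_t now_ms)
{
    /* The BO wasn't page faulted for a certain amount of time... */
    if (now_ms - s->last_fault_ms < FAULT_IDLE_MS)
        return false;
    /* ...and its last fault wasn't too close to its last CS request
     * (in either order). */
    if (abs_diff(s->last_fault_ms, s->last_cs_ms) < FAULT_CS_GAP_MS)
        return false;
    return true;
}
```

Under this scheme a BO that keeps bouncing (fault shortly after CS, or CS shortly after fault) keeps its flag set, while a BO whose CPU access has gone quiet eventually becomes eligible for invisible VRAM again.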