On 2021-04-09 12:39 p.m., Christian König wrote:
On 09.04.21 at 17:42, Andrey Grodzovsky wrote:
On 2021-04-09 3:01 a.m., Christian König wrote:
On 09.04.21 at 08:53, Christian König wrote:
On 08.04.21 at 22:39, Andrey Grodzovsky wrote:
[SNIP]
But inserting drm_dev_enter/exit at the highest level, in drm_ioctl,
is much less effort and leaves less room for error than going through
each IOCTL and trying to identify at what point (possibly multiple
points) it is about to access HW; some of this is hidden deep in HAL
layers such as the DC layer in the display driver or the multiple
layers of the powerplay/SMU libraries. Also, we can't limit ourselves
only to the back end, if by that you mean the ASIC-specific functions
which access registers. We also need to take care of any MMIO kernel
BOs (VRAM BOs) where we may access MMIO space directly by pointer from
the HW-agnostic front end of the driver and from the TTM/DRM layers.
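Just to illustrate what I mean by the top-level guard, a rough sketch
(the wrapper name is made up; only drm_dev_enter()/drm_dev_exit() and
drm_ioctl() are the real DRM helpers):

/* Hypothetical wrapper installed as fops.unlocked_ioctl; everything
 * after the drm_dev_enter() check is the normal dispatch. */
static long my_driver_guarded_ioctl(struct file *filp, unsigned int cmd,
				    unsigned long arg)
{
	struct drm_file *file_priv = filp->private_data;
	struct drm_device *dev = file_priv->minor->dev;
	long ret;
	int idx;

	if (!drm_dev_enter(dev, &idx))
		return -ENODEV;		/* device already unplugged */

	ret = drm_ioctl(filp, cmd, arg);	/* regular DRM IOCTL dispatch */

	drm_dev_exit(idx);	/* drm_dev_unplug()'s synchronize_srcu() waits for this */
	return ret;
}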
Exactly, yes. The key point is that we need to identify such places
anyway for GPU reset to work properly. So we could just piggyback
hotplug on top of that work and be done.
I see most of this was done by Denis in this patch:
https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=drm-misc-next&id=df9c8d1aa278c435c30a69b8f2418b4a52fcb929
Indeed, this doesn't cover the direct by-pointer accesses to MMIO, we
will introduce many more of those, and, as people write new code, new
places to cover will keep popping up, leading to regressions and extra
work to fix them. It would really be much better if we could
blanket-cover it at the very top, such as the root of all IOCTLs or
the top-level function of any queued work/timer, and handle it once
and for all.
And that's exactly what is not possible. At least for the reset case
you need to look into each hardware access and handle it bit by bit,
and I think that for the hotplug case we should go down that route as
well.
Our problem here is how to signal all the existing fences on one hand
and, on the other, prevent any new dma_fence waits after we have
finished signaling the existing ones. Once we solve this, there is no
problem using drm_dev_unplug in conjunction with drm_dev_enter/exit at
the highest level of drm_ioctl to flush any IOCTLs in flight and block
any new ones.
IMHO when we speak about signalling all fences we don't mean ALL the
currently existing dma_fence structs (they are spread all over the
place) but rather all the HW fences, because the HW is what's gone and
we can't expect those fences to ever be signaled. All the rest, such
as scheduler fences, user fences, drm_gem reservation objects etc.,
either depend on those HW fences, and hence signaling the HW fences
will in turn signal them, or are not impacted by the HW being gone and
hence can still be waited on and will complete. If this assumption is
correct, then I think we should use some flag to prevent any new
submission to HW which creates HW fences (somewhere around
amdgpu_fence_emit), then traverse all existing HW fences (currently
they are spread in a few places so maybe we need to track them in a
list) and signal them. After that it's safe to call drm_dev_unplug and
be sure synchronize_srcu won't stall because of a dma_fence_wait.
After that we can proceed to canceling work items, stopping schedulers
etc.
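In other words, roughly this ordering (just a sketch of the sequence,
not actual driver code; only pci_get_drvdata(), drm_dev_unplug() and
drm_dev_put() are real calls here):

/* Ordering sketch only. */
static void amdgpu_pci_remove_sketch(struct pci_dev *pdev)
{
	struct drm_device *ddev = pci_get_drvdata(pdev);

	/* 1. Stop creation of new HW fences: flip the flag discussed above
	 *    so anything reaching amdgpu_fence_emit() gets -ENODEV. */

	/* 2. Force-signal every HW fence already emitted, so that no
	 *    dma_fence_wait() on them can block forever. */

	/* 3. Only now unplug: the synchronize_srcu() inside drm_dev_unplug()
	 *    cannot stall on a fence wait any more; IOCTLs in flight drain,
	 *    new ones are rejected with -ENODEV. */
	drm_dev_unplug(ddev);

	/* 4. Cancel pending work items, stop the schedulers, then release
	 *    the device reference as usual. */
	drm_dev_put(ddev);
}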
That is problematic as well, since you need to make sure that the
scheduler is not creating a new hardware fence at the moment you try
to signal all of them. It would require another SRCU or lock for this.
What if we use a list and a flag called 'emit_allowed' under a lock,
such that in amdgpu_fence_emit we lock the list, check the flag and,
if it's true, add the new HW fence to the list and proceed with HW
emission as normal, otherwise return -ENODEV. In amdgpu_pci_remove we
take the lock, set the flag to false, and then iterate the list and
force-signal every fence on it. Will this not prevent any new HW fence
creation from that point on, from any place trying to do so?
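Something along these lines (all of the type and helper names below
are made up for illustration; only the list/spinlock/dma_fence
primitives are the real kernel API):

#include <linux/dma-fence.h>
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/* Hypothetical tracking structures for the 'emit_allowed' idea. */
struct hw_fence_node {
	struct dma_fence *fence;	/* the emitted HW fence */
	struct list_head head;
};

struct emit_gate {
	spinlock_t lock;
	bool emit_allowed;		/* cleared once in amdgpu_pci_remove() */
	struct list_head fences;	/* HW fences emitted but not yet signaled */
};

/* Would be called from amdgpu_fence_emit() before touching the ring. */
static int emit_gate_track(struct emit_gate *gate, struct hw_fence_node *node)
{
	spin_lock(&gate->lock);
	if (!gate->emit_allowed) {
		spin_unlock(&gate->lock);
		return -ENODEV;		/* device is going away, no new HW fences */
	}
	list_add_tail(&node->head, &gate->fences);
	spin_unlock(&gate->lock);
	return 0;			/* proceed with normal HW emission */
}

/* Would be called from amdgpu_pci_remove() before drm_dev_unplug(). */
static void emit_gate_shutdown(struct emit_gate *gate)
{
	struct hw_fence_node *node;

	spin_lock(&gate->lock);
	gate->emit_allowed = false;
	list_for_each_entry(node, &gate->fences, head)
		dma_fence_signal(node->fence);	/* force-signal what is left */
	spin_unlock(&gate->lock);
}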
Way too much overhead. The fence processing is intentionally lock-free
to avoid cache-line bouncing, because the IRQ can move from CPU to
CPU. We need something which doesn't affect the processing of fences
in the interrupt handler at all.
As far as I can see in the code, amdgpu_fence_emit is only called from
task context. Also, we can skip the list I proposed and just use
amdgpu_fence_driver_force_completion for each ring to signal all the
HW fences that were created.
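I.e. roughly the usual per-ring loop (a sketch inside amdgpu context;
the function name is made up, amdgpu_fence_driver_force_completion()
is the existing per-ring helper):

/* Sketch: force-complete the HW fences of every initialized ring. */
static void amdgpu_force_complete_all_rings(struct amdgpu_device *adev)
{
	int i;

	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
		struct amdgpu_ring *ring = adev->rings[i];

		if (!ring || !ring->fence_drv.initialized)
			continue;

		amdgpu_fence_driver_force_completion(ring);
	}
}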
Alternatively, grabbing the reset write side lock and then stopping
and restarting the scheduler could work as well.
Christian.
I didn't get the above, and I don't see why I need to reuse the GPU
reset rw_lock. I rely on the SRCU unplug flag for unplug. Also, it's
not clear to me why we are focusing on the scheduler threads; any code
path that generates HW fences should be covered, so any code leading
to amdgpu_fence_emit needs to be taken into account, such as direct IB
submissions, VM flushes etc.
You need to work together with the reset lock anyway, because a
hotplug could run at the same time as a reset.
If we go my way, I indeed now see that I have to take the reset write
side lock during HW fence signalling in order to protect against
scheduler/HW fence detachment and reattachment during scheduler
stop/restart. But if we go with your approach, then calling
drm_dev_unplug and scoping amdgpu_job_timeout with drm_dev_enter/exit
should be enough to prevent any concurrent GPU resets during unplug.
In fact I already do it anyway:
https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=drm-misc-next&id=ef0ea4dd29ef44d2649c5eda16c8f4869acc36b1
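The scoping I mean is simply this pattern in the job timeout handler
(amdgpu_job_timedout); a sketch with the actual timeout/reset handling
elided, only the drm_dev_enter()/drm_dev_exit() guard shown:

static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
{
	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
	struct amdgpu_device *adev = ring->adev;
	int idx;

	if (!drm_dev_enter(adev_to_drm(adev), &idx)) {
		/* Device is being unplugged, skip the GPU reset entirely. */
		return DRM_GPU_SCHED_STAT_NOMINAL;
	}

	/* ... normal timeout handling / amdgpu_device_gpu_recover() ... */

	drm_dev_exit(idx);
	return DRM_GPU_SCHED_STAT_NOMINAL;
}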
Andrey
Christian.
Andrey
Christian.
Andrey
Andrey