Re: Question about supporting AMD eGPU hot plug case

Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> · Tue, 16 Mar 2021 13:39:35 -0400

On 2021-03-16 12:17 p.m., Christian König wrote:
On 15.03.21 at 18:11, Andrey Grodzovsky wrote:

On 2021-03-15 12:55 p.m., Christian König wrote:

On 15.03.21 at 17:21, Andrey Grodzovsky wrote:

On 2021-03-15 12:10 p.m., Christian König wrote:
On 12.03.21 at 16:34, Andrey Grodzovsky wrote:

On 2021-03-12 4:03 a.m., Christian König wrote:
On 11.03.21 at 23:40, Andrey Grodzovsky wrote:
[SNIP]
The expected result is they all move closer to the start of PCI
address space.

Ok, I updated as you described. I also removed the PCI conf command
that stops address decoding and restarts it later, as I noticed the
PCI core does this itself when needed.
I have now also tested with a graphical desktop enabled while
submitting 3D draw commands, and under this scenario everything still
seems to work. Again, this all needs to be tested with a VRAM BAR
move, as I believe I will then see more issues, such as handling of
MMIO mapped VRAM objects (like the GART table). If you have an AMD
card, maybe you could also give it a try. In the meantime I will add
support for ioremapping those VRAM objects.

Andrey

Just an update: I added support for unmapping/remapping of all VRAM
objects, both user space mmapped and kernel ioremapped. It seems to
work ok but again, without forcing the VRAM BAR to move I can't be
sure.
Alex, Christian - take a look when you have some time to give me some
initial feedback on the amdgpu side.

The code is at 
https://cgit.freedesktop.org/~agrodzov/linux/log/?h=yadro%2Fpcie_hotplug%2Fmovable_bars_v9.1 

Mhm, that lets userspace busy retry until the BAR movement is done.

Not sure if that can't live lock somehow.

Christian.

In my testing it didn't, but I can instead route them to some global
static dummy page while the BARs are moving and then, when everything
is done, just invalidate the device address space again and let the
page faults fill in valid PFNs again.

Well that won't work because the reads/writes which are done in 
the meantime do need to wait for the BAR to be available again.

So waiting for the BAR move to finish is correct, but what we 
should do is to use a lock instead of an SRCU because that makes 
lockdep complain when we do something nasty.

Christian.

Spinlock, I assume? We can't sleep there - it's an interrupt.

Mhm, the BAR movement is in interrupt context?

No, the BAR move is in task context, I believe. The page faults are in
interrupt context and so we can only lock a spinlock there, I assume,

No, page faults are in task context as well! Otherwise you wouldn't be 
able to sleep for network I/O in a page fault for example.

Ok, that was a long-standing confusion on my side, especially because
'Understanding the Linux Kernel' states that do_page_fault is an
interrupt handler here -
https://vistech.net/~champ/online-docs/books/linuxkernel2/060.htm
while in fact it is an exception handler which is run in the context
of the user process causing it and hence can sleep (as explained here
by Robert Love himself) https://www.spinics.net/lists/newbies/msg07287.html

not a mutex which might sleep. But we can't hold a spinlock for the
entire BAR move because HW suspend + ASIC reset is a long process,
probably with some sleeps/context switches inside it.

Well that is rather bad. I was hoping to rename the GPU reset rw_sem 
into device_access rw_sem and then use the same lock for both (It's 
essentially the same problem).

I was thinking about it from day 1, but what looked different to me is
that in the GPU reset case there is no technical need to block MMIO
accesses, as the BARs are not moving and so the page table entries
remain valid. It's true that while the device is in reset those MMIO
accesses are meaningless - so this could indeed be a good reason to
block access even during GPU reset.

From the experience now I would say that we should block MMIO access 
during GPU reset as necessary.

We can't do things like always taking the lock in each IOCTL, but for 
low level hardware access it shouldn't be a problem at all.

Christian.

I will update the code then to reuse our adev->reset_sem for this locking.

Andrey

Andrey

But when we need to move the BAR in atomic/interrupt context that 
makes things a bit more complicated.

Christian.
