Hi, I started looking into actual implementation for this, as
reference point I am using Sege's NVME implementations (see
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FYADRO-KNS%2Flinux%2Fcommit%2F58c375df086502538706ee23e8ef7c8bb5a4178f&data=04%7C01%7Candrey.grodzovsky%40amd.com%7C44bc68a68feb4b71836c08d8dd9b306a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637503005898302093%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=nuTptAjGewG2T7tXOQr2h%2Bdr42lyg3m1%2Bfv0mJzd8tw%3D&reserved=0)
and our own amdgpu_device_resize_fb_bar. Our use case being more
complicated then NVME's arises some questions or thoughts:
Before the PCI core will move the BARs (VRAM and registers) we
have
to stop the HW and SW and then unmap all MMIO mapped entitles
in the
driver. This includes, registers of all types and VRAM placed BOs
with CPU visibility. My concern is how to prevent accesses to all
MMIO ranges between iounmap in rescan_preapare and ioremap in
rescan_done. For register accesses we are again back to same issue
we had with device unplug and GPU reset we are discussing with
Denis
now. So here at least as first solution just to confirm the
feature
works I can introduce low level (in register accessors) RW locking
to avoid access until all register access driver pointers are
remapped post BARs move. What more concerns me is how to - In
rescan_preapare: collect all CPU accessible BOs placed in VRAM (i
can use amdgpu specific list like I currently use for device
unplug), lock them to prevent any further use of them, unpin them
(and unmap). In rescan_preapare: remap them back and unlock
them for
further use. Also, there is a difference between kernel BOs where
indeed iounmap, ioremap should be enough and user space mapped BOs
where it seems like in hot unplug, we need in rescan_preapare to
only unamp them, in between drop new page faults with
VM_FAULT_RETRY
and post rescan_done allow page faults to go through to refill all
the user page tables with updated MMIO addresses.
Let me know your thoughts.
Andrey
On 2021-02-26 4:03 p.m., Andrey Grodzovsky wrote:
On 2021-02-26 1:44 a.m., Sergei Miroshnichenko wrote:
On Thu, 2021-02-25 at 13:28 -0500, Andrey Grodzovsky wrote:
On 2021-02-25 2:00 a.m., Sergei Miroshnichenko wrote:
On Wed, 2021-02-24 at 17:51 -0500, Andrey Grodzovsky wrote:
On 2021-02-24 1:23 p.m., Sergei Miroshnichenko wrote:
...
Are you saying that even without hot-plugging, while both
nvme
and
AMD
card are present
right from boot, you still get BARs moving and MMIO ranges
reassigned
for NVME BARs
just because amdgpu driver will start resize of AMD card
BARs and
this
will trigger NVMEs BARs move to
allow AMD card BARs to cover full range of VIDEO RAM ?
Yes. Unconditionally, because it is unknown beforehand if
NVMe's
BAR
movement will help. In this particular case BAR movement is
not
needed,
but is done anyway.
BARs are not moved one by one, but the kernel releases all the
releasable ones, and then recalculates a new BAR layout to
fit them
all. Kernel's algorithm is different from BIOS's, so NVME has
appeared
at a new place.
This is triggered by following:
- at boot, if BIOS had assigned not every BAR;
- during pci_resize_resource();
- during pci_rescan_bus() -- after a pciehp event or a
manual via
sysfs.
By manual via sysfs you mean something like this - 'echo 1 >
/sys/bus/pci/drivers/amdgpu/0000\:0c\:00.0/remove && echo 1 >
/sys/bus/pci/rescan ' ? I am looking into how most reliably
trigger
PCI
code to call my callbacks even without having external PCI
cage for
GPU
(will take me some time to get it).
Yeah, this is our way to go when a device can't be physically
removed
or unpowered remotely. With just a bit shorter path:
sudo sh -c 'echo 1 >
/sys/bus/pci/devices/0000\:0c\:00.0/remove'
sudo sh -c 'echo 1 > /sys/bus/pci/rescan'
Or, just a second command (rescan) is enough: a BAR movement
attempt
will be triggered even if there were no changes in PCI topology.
Serge
Ok, able to trigger stub callbacks call after doing rescan from
sysfs. Also I see BARs movement
[ 1860.968036] amdgpu 0000:09:00.0: BAR 0: assigned [mem
0xc0000000-0xcfffffff 64bit pref]
[ 1860.968050] amdgpu 0000:09:00.0: BAR 0 updated: 0xe0000000 ->
0xc0000000
[ 1860.968054] amdgpu 0000:09:00.0: BAR 2: assigned [mem
0xd0000000-0xd01fffff 64bit pref]
[ 1860.968068] amdgpu 0000:09:00.0: BAR 2 updated: 0xf0000000 ->
0xd0000000
[ 1860.968071] amdgpu 0000:09:00.0: BAR 5: assigned [mem
0xf0100000-0xf013ffff]
[ 1860.968078] amdgpu 0000:09:00.0: BAR 5 updated: 0xfce00000 ->
0xf0100000
As expected, after the move the device becomes unresponsive as
unmap/ioremap wasn't done.
Andrey