Re: [PATCH] drm/doc: device hot-unplug for userspace

Andrey Grodzovsky <Andrey.Grodzovsky@xxxxxxx> · Thu, 28 May 2020 17:28:29 -0400

On 5/27/20 4:25 PM, Daniel Vetter wrote:
On Wed, May 27, 2020 at 9:44 PM Christian König
<christian.koenig@xxxxxxx> wrote:
Am 27.05.20 um 17:23 schrieb Andrey Grodzovsky:
On 5/27/20 10:39 AM, Daniel Vetter wrote:
On Wed, May 27, 2020 at 3:51 PM Andrey Grodzovsky
<Andrey.Grodzovsky@xxxxxxx> wrote:
On 5/27/20 2:44 AM, Pekka Paalanen wrote:
On Tue, 26 May 2020 10:30:20 -0400
Andrey Grodzovsky <Andrey.Grodzovsky@xxxxxxx> wrote:

On 5/19/20 6:06 AM, Pekka Paalanen wrote:
From: Pekka Paalanen <pekka.paalanen@xxxxxxxxxxxxx>

Set up the expectations on how hot-unplugging a DRM device should look like to
userspace.

Written by Daniel Vetter's request and largely based on his comments in IRC and
from https://lists.freedesktop.org/archives/dri-devel/2020-May/265484.html .

Signed-off-by: Pekka Paalanen <pekka.paalanen@xxxxxxxxxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
Cc: Dave Airlie <airlied@xxxxxxxxxx>
Cc: Sean Paul <sean@xxxxxxxxxx>

---

Disclaimer: I am a userspace developer writing for other userspace developers.
I took some liberties in defining what should happen without knowing what is
actually possible or what existing drivers already implement.
---
     Documentation/gpu/drm-uapi.rst | 75 ++++++++++++++++++++++++++++++++++
     1 file changed, 75 insertions(+)

diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 56fec6ed1ad8..80db4abd2cbd 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -1,3 +1,5 @@
+.. Copyright 2020 DisplayLink (UK) Ltd.
+
     ===================
     Userland interfaces
     ===================
@@ -162,6 +164,79 @@ other hand, a driver requires shared state between clients which is
     visible to user-space and accessible beyond open-file boundaries, they
     cannot support render nodes.

+Device Hot-Unplug
+=================
+
+.. note::
+   The following is the plan. Implementation is not there yet
+   (2020 May 13).
+
+Graphics devices (display and/or render) may be connected via USB (e.g.
+display adapters or docking stations) or Thunderbolt (e.g. eGPU). An end
+user is able to hot-unplug this kind of devices while they are being
+used, and expects that the very least the machine does not crash. Any
+damage from hot-unplugging a DRM device needs to be limited as much as
+possible and userspace must be given the chance to handle it if it wants
+to. Ideally, unplugging a DRM device still lets a desktop to continue
+running, but that is going to need explicit support throughout the whole
+graphics stack: from kernel and userspace drivers, through display
+servers, via window system protocols, and in applications and libraries.
+
+Other scenarios that should lead to the same are: unrecoverable GPU
+crash, PCI device disappearing off the bus, or forced unbind of a driver
+from the physical device.
+
+In other words, from userspace perspective everything needs to keep on
+working more or less, until userspace stops using the disappeared DRM
+device and closes it completely. Userspace will learn of the device
+disappearance from the device removed uevent or in some cases specific
+ioctls returning EIO.
+
+This goal raises at least the following requirements for the kernel and
+drivers:
+
+- The kernel must not hang, crash or oops, no matter what userspace was
+  in the middle of doing when the device disappeared.
+
+- All GPU jobs that can no longer run must have their fences
+  force-signalled to avoid inflicting hangs to userspace.
+
+- KMS connectors must change their status to disconnected.
+
+- Legacy modesets and pageflips fake success.
+
+- Atomic commits, both real and TEST_ONLY, fake success.
+
+- Pending non-blocking KMS operations deliver the DRM events userspace
+  is expecting.
+
+- If underlying memory disappears, the mmaps are replaced with harmless
+  zero pages where access does not raise SIGBUS. Reads return zeros,
+  writes are ignored.
Regarding this paragraph - what about existing mappings? In the first
patchset we would actively invalidate all the existing CPU mappings to
device memory, and I think we should still do it, otherwise we will see
random crashes in applications as before. I guess that's because TLBs
and page tables are not updated to reflect the fact that the device is
gone.
Hi,

I was talking about existing mappings. What I forgot to specify is how
new mmap() calls after the device disappearance should work (the end
result should be the same still, not failure).

I'll clarify this in the next revision.


Thanks,
pq
I see, that's OK.

Next related question is more for Daniel/Christian - about the
implementation of this paragraph, I was thinking about something like
checking for device disconnect in ttm_bo_vm_fault_reserved and, if so,
remapping the entire VA range of the VMA that contains the fault address
to the global zero page, i.e. remap_pfn_range(vma, vma->vm_start,
page_to_pfn(ZERO_PAGE(vma->vm_start)), vma->vm_end - vma->vm_start,
vma->vm_page_prot). The question is, when the doc says 'writes are
ignored', does it mean I should use copy on write for vma->vm_page_prot,
and if so, how do I actually do it? I was not able to find which flags
to set in vm_page_prot to force copy-on-write behavior.
Already discussed this with Pekka on irc, I think simply a private
page (per gpu ctx to avoid leaks) is good enough. Otherwise we need to
catch write faults and throw the writes away, and that's a) a bit
tricky to implement and b) slow, which we kinda don't want to. If the
desktop is stuck for a few seconds because we're trapping every write
of a 4k buffer that's getting uploaded, the user is going to have a
bad time :-/
-Daniel

So like allocating a page per process context in the driver (struct
amdgpu_ctx in amdgpu) and mapping this page into the faulting VMAs
when the device is disconnected? I am still not clear how I make the
mapping ignore writes without catching write faults and ignoring them.
I obviously cannot just make it read-only, and I can't make it writable,
as then reading back will start returning non-zero values. My question is
which set of flags in vm_area_struct.vm_flags can (if at all) give me
'ignore writes' behavior for the mapping of that page.
I'm not aware of a possibility like that on x86 CPUs. As far as I know
we only have something like an ignore write functionality on our GPUs
for PRTs.

Could we use an address which points to a non-allocated MMIO space or
something like this? We might get 0xffffffff on reads instead of
0x0, but writes would certainly be ignored.
I think just a page with garbage in, garbage out semantics is going to
be ok. I think pretty much anything has a chance to upset userspace,
so whether it's 0 or all 1s or anything else doesn't really matter.

Only thing that does matter a bit is that we have a page per fd, so
that we don't accidentally leak something between processes where we
shouldn't. I think as long as we don't crash&burn in a SIGBUS it's
good enough.
-Daniel


To use non-allocated MMIO space I would first need to know which range
is currently not used (how?) and then reserve it (and free it later) to
keep other devices from using it. I think the interface for this is
https://elixir.bootlin.com/linux/v5.7-rc7/source/include/linux/ioport.h#L233.
But I still like the zero page approach more, where we map the zero page
into the faulting process's page table on new page faults, with
~(VM_SHARED | VM_MAYSHARE) applied to vma->vm_flags, or at least to the
pgprot_t for this particular mapping. I think that makes the mapping copy
on write, so each process (each FD) will also not leak data to other
processes.
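
As a rough userspace illustration of the copy-on-write behavior described
above (not kernel or TTM code, only an analogy): an anonymous MAP_PRIVATE
mapping on Linux reads as zeros, and a write by one process lands in a
private copy that other processes never see. The helper names
map_private_zero_page() and parent_view_after_child_write() are made up for
this sketch, and it assumes a POSIX system with MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of private, zero-filled memory. */
unsigned char *map_private_zero_page(size_t len)
{
    assert(len > 0); /* mmap() rejects zero-length mappings */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/*
 * Hypothetical demo: fork a child that writes 0xAB into the mapping, then
 * return the byte the parent still sees at offset 0 (or -1 on error).
 */
int parent_view_after_child_write(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *page = map_private_zero_page(len);
    if (!page)
        return -1;

    /* A fresh private anonymous mapping must read back as zero. */
    if (page[0] != 0) {
        munmap(page, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(page, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: the write goes to the child's own private copy. */
        page[0] = 0xAB;
        /* The writer does read back its own data (garbage in, garbage out). */
        _exit(page[0] == 0xAB ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(page, len);
        return -1;
    }

    /* Parent: the child's write is not visible here. */
    int seen = page[0];
    munmap(page, len);
    return seen;
}
```

Note what this does and does not show: the writing process reads back its
own non-zero data, so writes are not literally ignored, but nothing leaks
across processes. Whether a driver fault handler can get the same effect for
a device VMA is exactly the open question in this thread.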

Andrey



Christian.

Andrey


Andrey




+
+- dmabuf which point to memory that has disappeared are rewritten to
+  point to harmless zero pages, similar to mmaps. Imports still succeed
+  both ways: an existing device importing a dmabuf pointing to
+  disappeared memory, and a disappeared device importing any dmabuf.
+
+- Render ioctls return EIO which is then handled in userspace drivers,
+  e.g. Mesa, to have the device disappearance handled in the way
+  specified for each API (OpenGL, GL ES: GL_KHR_robustness;
+  Vulkan: VK_ERROR_DEVICE_LOST; etc.)
+
+Raising SIGBUS is not an option, because userspace cannot realistically
+handle it.  Signal handlers are global, which makes them extremely
+difficult to use correctly from libraries like Mesa produces. Signal
+handlers are not composable, you can't have different handlers for GPU1
+and GPU2 from different vendors, and a third handler for mmapped regular
+files.  Threads cause additional pain with signal handling as well.
+
+Only after userspace has closed all relevant DRM device and dmabuf file
+descriptors and removed all mmaps, the DRM driver can tear down its
+instance for the device that no longer exists. If the same physical
+device somehow comes back in the mean time, it shall be a new DRM
+device.
+
     .. _drm_driver_ioctl:

     IOCTL Support on Device Nodes

