On Mon, Sep 09, 2024 at 03:01:50PM -0500, Lucas De Marchi wrote:
> On Sun, Sep 08, 2024 at 11:08:39PM GMT, Asahi Lina wrote:
> > On 9/8/24 12:07 AM, Lucas De Marchi wrote:
> > > On Sat, Sep 07, 2024 at 08:38:30PM GMT, Asahi Lina wrote:
> > > > On 9/6/24 6:42 PM, Raag Jadav wrote:
> > > > > Introduce device wedged event, which will notify userspace of wedged
> > > > > (hanged/unusable) state of the DRM device through a uevent. This is
> > > > > useful especially in cases where the device is in unrecoverable state
> > > > > and requires userspace intervention for recovery.
> > > > >
> > > > > Purpose of this implementation is to be vendor agnostic. Userspace
> > > > > consumers (sysadmin) can define udev rules to parse this event and
> > > > > take respective action to recover the device.
> > > > >
> > > > > Consumer expectations:
> > > > > ----------------------
> > > > > 1) Unbind driver
> > > > > 2) Reset bus device
> > > > > 3) Re-bind driver
> > > >
> > > > Is this supposed to be normative? For drm/asahi we have a "wedged"
> > > > concept (firmware crashed), but the only possible recovery action is a
> > > > full system reboot (which might still be desirable to allow userspace to
> > > > trigger automatically in some scenarios) since there is no bus-level
> > > > reset and no firmware reload possible.
> > >
> > > maybe let drivers hint possible/supported recovery mechanisms and then
> > > sysadmin chooses what to do?
> >
> > How would we do this? A textual value for the event or something like
> > that? ("WEDGED=bus-reset" vs "WEDGED=reboot"?)
>
> If there's a need for more than one, than I think exposing the supported
> ones sorted by "side effect" in sysfs would be good. Something like:
>
>     $ cat /sys/class/drm/card0/device/wedge_recover
>     rebind
>     bus-reset
>     reboot

How do we expect the drivers to flag supported ones? Extra hooks?

Raag
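
For illustration only, the consumer side of the sysfs idea quoted above
could look roughly like the sketch below. It assumes the attribute is
called "wedge_recover" and lists one method per line, least disruptive
first, as in the example in the thread; no such attribute exists today.
The ALLOWED policy set and all function names are hypothetical.

```python
# Sketch only: "wedge_recover" is the attribute name floated in this thread,
# not an existing kernel interface.
from pathlib import Path

# Hypothetical admin policy: methods this machine may run automatically.
ALLOWED = {"rebind", "bus-reset"}


def parse_methods(text):
    """Return the listed recovery methods in file order (least side effect first)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def pick_method(text, allowed=ALLOWED):
    """Pick the first listed method that local policy permits, or None."""
    for method in parse_methods(text):
        if method in allowed:
            return method
    return None


def read_methods(card="card0"):
    """Read the proposed attribute; return "" if the kernel does not expose it."""
    path = Path("/sys/class/drm") / card / "device" / "wedge_recover"
    try:
        return path.read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    print(pick_method(read_methods()))
```

A driver like drm/asahi that can only offer "reboot" would make
pick_method() return None under this policy, so the admin would have to
opt in to reboot explicitly.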