On Mon, Sep 23, 2024 at 11:38:55AM +0300, Andy Shevchenko wrote: > On Mon, Sep 23, 2024 at 09:28:23AM +0530, Raag Jadav wrote: > > Introduce device wedged event, which will notify userspace of wedged > > (hanged/unusable) state of the DRM device through a uevent. This is > > useful especially in cases where the device is no longer operating as > > expected and has become unrecoverable from driver context. > > > > Purpose of this implementation is to provide drivers a way to recover > > through userspace intervention. Different drivers may have different > > ideas of a "wedged device" depending on their hardware implementation, > > and hence the vendor agnostic nature of the event. It is up to the drivers > > to decide when they see the need for recovery and how they want to recover > > from the available methods. > > > > Current implementation defines three recovery methods, out of which, > > drivers can choose to support any one or multiple of them. Preferred > > recovery method will be sent in the uevent environment as WEDGED=<method>. > > Userspace consumers (sysadmin) can define udev rules to parse this event > > and take respective action to recover the device. > > > > Method | Consumer expectations > > -----------|----------------------------------- > > rebind | unbind + rebind driver > > bus-reset | unbind + reset bus device + rebind > > reboot | reboot system > > > v4: s/drm_dev_wedged/drm_dev_wedged_event > > Use drm_info() (Jani) > > Kernel doc adjustment (Aravind) > > v5: Send recovery method with uevent (Lina) > > v6: Access wedge_recovery_opts[] using helper function (Jani) > > Use snprintf() (Jani) > > Hmm... Isn't changelog in the cover letter is not enough? Which was initial thought but I'm told otherwise ¯\_(ツ)_/¯ > ... > > > +extern const char *const wedge_recovery_opts[]; > > It's not NULL terminated. How users will know that they have an index valid? It's expected to be accessed using recovery_*() helpers. > Either you NULL-terminate that, or export the size as well (personally I would > go with the first approach). > > ... > > > +static inline bool recovery_method_is_valid(enum wedge_recovery_method method) > > +{ > > + if (method >= DRM_WEDGE_RECOVERY_REBIND && method < DRM_WEDGE_RECOVERY_MAX) > > + return true; > > + > > + return false; > > Besides that this can be written as > > return method >= DRM_WEDGE_RECOVERY_REBIND && method < DRM_WEDGE_RECOVERY_MAX; > > > +} > > this seems a runtime approach for what we have at compile-time, i.e. static_assert() My understanding is that we have runtime users that the compiler may not be able to resolve. Raag