On Wed, Mar 16, 2022 at 7:12 AM Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
>
> On Wed, Mar 16, 2022 at 4:48 AM Pekka Paalanen <ppaalanen@xxxxxxxxx> wrote:
> > [snip]
> > With new UAPI comes the demand of userspace proof, not hand-waving. You
> > would not be proposing this new interface if you didn't have use cases
> > in mind, even just one. You have to document what you imagine the new
> > thing to be used for, so that the appropriateness can be evaluated. If
> > the use case is deemed inappropriate for the proposed UAPI, you need to
> > find another use case to justify adding the new UAPI. If there is no
> > use for the UAPI, it shouldn't be added, right? Adding UAPI and hoping
> > someone finds use for it seems backwards to me.
>
> We do have a use case. It's what I described originally. There is a
> user space daemon (could be a compositor, could be something else)
> that runs and listens for GPU reset notifications. When it receives a
> notification, it takes action and kills the guilty app and restarts
> the compositor and gathers any relevant data related to the GPU hang
> (if possible). We can revisit this discussion once we have the whole
> implementation complete. Other drivers seem to do similar things
> already today via different means (msm using devcoredump, i915 seems
> to have its own GPU reset notification mechanism, etc.). It just
> seemed like there was value in having a generic drm GPU reset
> notification, but maybe not yet.

Just one point of clarification: in the msm and i915 cases it is purely
for debugging and telemetry (i.e. sending crash logs back to the distro
for analysis if the user has crash reporting enabled). It isn't used to
trigger any action like killing the app or compositor.

I would, however, *strongly* recommend devcoredump support in other GPU
drivers (i915's thing pre-dates devcoredump by a lot). I've used it to
debug and fix a couple of obscure issues that I was not able to
reproduce by myself.

BR,
-R