On Tue, Mar 12, 2024 at 10:39:46AM -0700, Abhinav Kumar wrote:
> On 3/12/2024 9:59 AM, Johan Hovold wrote:

> >> Heh. This is getting ridiculous. I just tried running with this patch
> >> and it again breaks hotplug detect in a VT console and in X (where I
> >> could enable a reconnected external display by running xrandr twice
> >> before).
> >>
> >> So, please, do not apply this one.
> >
> > To make things worse, I indeed also hit the reset when disconnecting
> > after such a failed hotplug.

> Ack, I will hold off till I analyze your issues more which you have
> listed in separate replies. Especially about the spurious connect, I
> believe you are trying to mention that, by adding logs, you are able to
> delay the processing of a connect event to *make* it like a spurious
> one? In case, I got this part wrong, can you pls explain the spurious
> connect scenario again?

No, I only mentioned the debug printks in passing as instrumentation like
that may affect race conditions (but I'm also hitting the resets with no
printks in place).

The spurious connect event comes directly from the pmic firmware, and
even if we may optimise things by implementing some kind of debounce,
the hotplug implementation needs to be robust enough to not kill the
machine if such an event gets through.

Basically what I see is that during physical disconnect there can be
multiple hpd notify events (e.g. connect, disconnect, connect):

[ 146.910195] usb 5-1: USB disconnect, device number 4
[ 146.931026] msm-dp-display ae98000.displayport-controller: dp_bridge_hpd_notify - link_ready = 1, status = 2
[ 146.934785] msm-dp-display ae98000.displayport-controller: dp_hpd_unplug_handle
[ 146.938114] msm-dp-display ae98000.displayport-controller: dp_bridge_hpd_notify - link_ready = 1, status = 1
[ 146.940245] [CONNECTOR:35:DP-2] status updated from disconnected to connected
[ 146.955193] msm-dp-display ae98000.displayport-controller: dp_bridge_hpd_notify - link_ready = 0, status = 2

And it is the spurious connect event while the link is being torn down
that triggers the hotplug processing that leads to the reset.

Similarly, I've seen spurious disconnect events while the plug is being
inserted.

> A short response on why this change was made is that commit can be
> issued by userspace or the fbdev client. So userspace involvement only
> makes commit happen from a different path. It would be incorrect to
> assume the issues from the earlier bug and the current one are different
> only because there was userspace involvement in that one and not this.
>
> Because in the end, it manifests itself in the same way that
> atomic_enable() did not go through after an atomic_disable() and the
> next atomic_disable() crashes.

Right, but your proposed fix would not actually fix anything and judging
from the sparse commit message and diff itself it is clearly only meant
to mitigate the case where user space is involved, which is *not* the
case here.

Johan