Hi,

I've recently been working on adding interrupt-based hotplug detection
support to vc4 for the HDMI controller, uncovering a bunch of bugs in
the process and, interestingly, triggering a hard CPU hang. The HDMI
controller's CRTC is shared with a TV connector that doesn't have any
hotplug detection and returns an unknown status all the time.

After digging into it, the sequence that leads to the CPU hang is:

  - The HDMI HPD interrupt is triggered and the connector is now
    detected as disconnected.

  - A new commit is done that will disable the HDMI controller and
    enable the TV encoder.

  - At the end of that commit, the fbdev DRM client will run
    drm_fbdev_client_hotplug[1], and then call
    drm_fb_helper_hotplug_event[2]. In that function,
    drm_client_modeset_probe[3] is then called. Once all the proper
    arrays have been allocated, it will call .fill_modes on all the
    connectors, which for both connectors is
    drm_helper_probe_single_connector_modes[4].

  - Here, we go down the drm_helper_probe_detect[5] path, and will end
    up calling our connector detect hook.

Now, this is fairly simple in the TV connector case, where we always
return connector_status_unknown. However, the HDMI case is a bit more
complicated:

https://elixir.bootlin.com/linux/v5.12/source/drivers/gpu/drm/vc4/vc4_hdmi.c#L157

We basically first look at a HPD GPIO if we have one, then we try to
probe the display through DDC, and finally we try to access the hotplug
detection register in the HDMI controller.

In the failing case, we were first going through the DDC probe, and it
turns out that it fails from time to time for some reason. It looks
timing-related, since adding a bunch of printk calls makes it work, but
I'm not quite sure why at this point.

Anyway, we were then accessing the HDMI_HOTPLUG register to read the
status. Except that we had completely powered down the HDMI controller,
its clocks and its power domain as part of the atomic_disable triggered
by the commit in the second bullet point, so that register access
completely stalls the CPU.
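
For illustration, the detect ordering described above can be modelled
in a small userspace C sketch. This is not the vc4 code: every name in
it (struct fake_hdmi, fake_read_hotplug, fake_hdmi_detect) is
hypothetical, and it assumes the three checks are chained as
alternatives. The power check in front of the register read, and the
choice to report "unknown" when the block is off, are assumptions
showing one possible policy, not what DRM requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same three states as the kernel's enum drm_connector_status. */
enum connector_status {
	STATUS_CONNECTED = 1,
	STATUS_DISCONNECTED = 2,
	STATUS_UNKNOWN = 3,
};

/* Hypothetical, heavily simplified stand-in for the driver state. */
struct fake_hdmi {
	bool has_hpd_gpio;      /* board wires HPD to a GPIO */
	bool hpd_gpio_asserted; /* level read back from that GPIO */
	bool ddc_probe_ok;      /* result of the DDC probe */
	bool powered;           /* controller, clocks and power domain are up */
	bool hotplug_bit;       /* value the hotplug register would return */
	int register_reads;     /* how many times the register was touched */
};

/* Models the HDMI_HOTPLUG read; only legal while the block is powered. */
static bool fake_read_hotplug(struct fake_hdmi *hdmi)
{
	/* On the real hardware, this access while powered down is the stall. */
	assert(hdmi->powered);
	hdmi->register_reads++;
	return hdmi->hotplug_bit;
}

enum connector_status fake_hdmi_detect(struct fake_hdmi *hdmi)
{
	assert(hdmi != NULL);

	if (hdmi->has_hpd_gpio) {
		/* 1. HPD GPIO, if the board has one. */
		if (hdmi->hpd_gpio_asserted)
			return STATUS_CONNECTED;
	} else if (hdmi->ddc_probe_ok) {
		/* 2. DDC probe answered. */
		return STATUS_CONNECTED;
	} else if (!hdmi->powered) {
		/* 3a. Assumed policy: never touch registers of a powered-down block. */
		return STATUS_UNKNOWN;
	} else if (fake_read_hotplug(hdmi)) {
		/* 3b. Hotplug register, safe because the block is powered. */
		return STATUS_CONNECTED;
	}

	return STATUS_DISCONNECTED;
}
```

In this model, the failing case is no GPIO, a DDC probe that fails, and
powered == false: without the 3a branch, the detect path would go
straight to the register read. A real driver could instead power the
block up around the read; which of the two is expected is exactly the
open question.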

I guess my question is two-fold: does the above flow make sense from a
DRM standpoint (it looks so to me), and what are the expectations for
detect in terms of controller power status? The doc doesn't seem to
mention anything about it.

Thanks!
Maxime

1: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fb_helper.c#L2396
2: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fb_helper.c#L1948
3: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_client_modeset.c#L765
4: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_probe_helper.c#L416
5: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_probe_helper.c#L328