Quoting Douglas Anderson (2022-05-31 16:01:26)
> In commit a670ff578f1f ("drm/msm/dpu: always use mdp device to scale
> bandwidth") we fully moved interconnect stuff to the DPU driver. This
> had no change for sc7180 but _did_ have an impact for other SoCs. It
> made them match the sc7180 scheme.
>
> Unfortunately, the sc7180 scheme seems like it was a bit broken.
> Specifically the interconnect needs to be on for more than just the
> DPU driver's AXI bus. At the very least it also needs to be on for the
> DSI driver's AXI bus. This can be seen fairly easily by doing this on
> a ChromeOS sc7180-trogdor class device:
>
>   set_power_policy --ac_screen_dim_delay=5 --ac_screen_off_delay=10
>   sleep 10
>   cd /sys/bus/platform/devices/ae94000.dsi/power
>   echo on > control
>
> When you do that, you'll get a warning splat in the logs about
> "gcc_disp_hf_axi_clk status stuck at 'off'".
>
> One could argue that perhaps what I have done above is "illegal" and
> that it can't happen naturally in the system because in normal system
> usage the DPU is pretty much always on when DSI is on. That being
> said:
> * In official ChromeOS builds (admittedly a 5.4 kernel with backports)
>   we have seen that splat at bootup.
> * Even though we don't use "autosuspend" for these components, we
>   don't use the "put_sync" variants. Thus plausibly the DSI could stay
>   "runtime enabled" past when the DPU is enabled. Technically we
>   shouldn't do that if the DPU's suspend ends up yanking our clock.
>
> Let's change things such that the "bare minimum" request for the
> interconnect happens in the mdss driver again. That means that all of
> the children can assume that the interconnect is on at the minimum
> bandwidth. We'll then let the DPU request the higher amount that it
> wants.
>
> It should be noted that this isn't as hacky of a solution as it might
> initially appear.
> Specifically:
> * Since MDSS and DPU individually get their own references to the
>   interconnect then the framework will actually handle aggregating
>   them. The two drivers are _not_ clobbering each other.
> * When the Qualcomm interconnect driver aggregates it takes the max of
>   all the peaks. Thus having MDSS request a peak, as we're doing here,
>   won't actually change the total interconnect bandwidth (it won't be
>   added to the request for the DPU). This perhaps explains why the
>   "average" requested in MDSS was historically 0 since that one
>   _would_ be added in.
>
> NOTE also that in the downstream ChromeOS 5.4 and 5.15 kernels, we're
> also seeing some RPMH hangs that are addressed by this fix. These
> hangs are showing up in the field and on _some_ devices with enough
> stress testing of suspend/resume. Specifically right at suspend time
> with a stack crawl that looks like this (from chromeos-5.15 tree):
>   rpmh_write_batch+0x19c/0x240
>   qcom_icc_bcm_voter_commit+0x210/0x420
>   qcom_icc_set+0x28/0x38
>   apply_constraints+0x70/0xa4
>   icc_set_bw+0x150/0x24c
>   dpu_runtime_resume+0x50/0x1c4
>   pm_generic_runtime_resume+0x30/0x44
>   __genpd_runtime_resume+0x68/0x7c
>   genpd_runtime_resume+0x12c/0x20c
>   __rpm_callback+0x98/0x138
>   rpm_callback+0x30/0x88
>   rpm_resume+0x370/0x4a0
>   __pm_runtime_resume+0x80/0xb0
>   dpu_kms_enable_commit+0x24/0x30
>   msm_atomic_commit_tail+0x12c/0x630
>   commit_tail+0xac/0x150
>   drm_atomic_helper_commit+0x114/0x11c
>   drm_atomic_commit+0x68/0x78
>   drm_atomic_helper_disable_all+0x158/0x1c8
>   drm_atomic_helper_suspend+0xc0/0x1c0
>   drm_mode_config_helper_suspend+0x2c/0x60
>   msm_pm_prepare+0x2c/0x40
>   pm_generic_prepare+0x30/0x44
>   genpd_prepare+0x80/0xd0
>   device_prepare+0x78/0x17c
>   dpm_prepare+0xb0/0x384
>   dpm_suspend_start+0x34/0xc0
>
> We don't completely understand all the mechanisms in play, but the
> hang seemed to come and go with random factors.
> It's not terribly surprising that the hang is gone after this patch
> since the line of code that was failing is no longer present in the
> kernel.
>
> Fixes: a670ff578f1f ("drm/msm/dpu: always use mdp device to scale bandwidth")
> Fixes: c33b7c0389e1 ("drm/msm/dpu: add support for clk and bw scaling for display")
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> ---

Reviewed-by: Stephen Boyd <swboyd@xxxxxxxxxxxx>
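
For anyone following the aggregation argument in the quoted commit
message, here is a minimal userspace sketch of "sum the averages, take
the max of the peaks". It is only a model of the behaviour described
above, not the kernel's interconnect code: struct bw_vote and
aggregate_votes() are hypothetical names made up for illustration, and
the units are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* One consumer's bandwidth vote (hypothetical model, arbitrary units). */
struct bw_vote {
	uint32_t avg;
	uint32_t peak;
};

/*
 * Combine several votes the way the commit message describes:
 * average requests are added together, peak requests take the maximum.
 */
static struct bw_vote aggregate_votes(const struct bw_vote *votes, size_t n)
{
	struct bw_vote total = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		total.avg += votes[i].avg;
		if (votes[i].peak > total.peak)
			total.peak = votes[i].peak;
	}

	return total;
}
```

With made-up numbers, an MDSS-style vote of { .avg = 0, .peak = 400000 }
combined with a larger DPU-style vote of { .avg = 3000000, .peak =
6800000 } aggregates to exactly the DPU's values, so the minimum vote
only matters while the DPU's own vote is absent.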