On 23/11/2023 12:50, AngeloGioacchino Del Regno wrote: > Some SoCs may be equipped with a GPU containing two core groups > and this is exactly the case of Samsung's Exynos 5422 featuring > an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost > is partial, as this driver currently supports using only one > core group and that's reflected on all parts of it, including > the power on (and power off, previously to this patch) function. > > The issue with this is that even though executing the soft reset > operation should power off all cores unconditionally, on at least > one platform we're seeing a crash that seems to be happening due > to an interrupt firing which may be because we are calling power > transition only on the first core group, leaving the second one > unchanged, or because ISR execution was pending before entering > the panfrost_gpu_power_off() function and executed after powering > off the GPU cores, or all of the above. > > Finally, solve this by introducing a new panfrost_gpu_suspend_irq() > helper function and changing the panfrost_device_suspend() flow to > 1. Mask and clear all interrupts: we don't need nor want any, as > for power_off() we are polling PWRTRANS, but we anyway don't > want GPU IRQs to fire while it is suspended/powered off; > 2. Call synchronize_irq() after that to make sure that any pending > ISR is executed before powering off the GPU Shaders/Tilers/L2 > hence avoiding unpowered registers R/W; and > 3. Ignore the core_mask and ask the GPU to poweroff both core groups > > Of course it was also necessary to add a `irq` variable to `struct > panfrost_device` as we need to get that in panfrost_gpu_power_off() > for calling synchronize_irq() on it. > > Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()") > [Regression detected on Odroid HC1, Exynos5422, Mali-T628 MP6] > Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx> > Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx> > --- > > Changes in v2: > - Fixed the commit hash of "Really power off [...]" > - Actually based on a clean next-20231121 > - Renamed "irq" to "gpu_irq" as per Boris' suggestion > - Moved the IRQ mask/clear/sync to a helper function and added > a call to that in panfrost_device.c instead of doing that in > panfrost_gpu_power_off(). > > NOTE: I didn't split 1+2 from 3 as suggested by Boris, and I'm sending > this one without waiting for feedback on my reasons for that which I > explained as a reply to v1 because the former couldn't be applied to > linux-next, and I want to unblock Krzysztof ASAP to get this tested. > This does not compile. Best regards, Krzysztof