Re: [PATCH] drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21/11/2023 17:11, AngeloGioacchino Del Regno wrote:
> Il 21/11/23 16:34, Krzysztof Kozlowski ha scritto:
>> On 08/11/2023 14:20, Steven Price wrote:
>>> On 02/11/2023 14:15, AngeloGioacchino Del Regno wrote:
>>>> The layout of the registers {TILER,SHADER,L2}_PWROFF_LO, used to request
>>>> powering off cores, is the same as the {TILER,SHADER,L2}_PWRON_LO ones:
>>>> this means that in order to request poweroff of cores, we are supposed
>>>> to write a bitmask of cores that should be powered off!
>>>> This means that the panfrost_gpu_power_off() function has always been
>>>> doing nothing.
>>>>
>>>> Fix powering off the GPU by writing a bitmask of the cores to poweroff
>>>> to the relevant PWROFF_LO registers and then check that the transition
>>>> (from ON to OFF) has finished by polling the relevant PWRTRANS_LO
>>>> registers.
>>>>
>>>> While at it, in order to avoid code duplication, move the core mask
>>>> logic from panfrost_gpu_power_on() to a new panfrost_get_core_mask()
>>>> function, used in both poweron and poweroff.
>>>>
>>>> Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
>>>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
>>
>>
>> Hi,
>>
>> This commit was added to next recently but it causes "external abort on
>> non-linefetch" during boot of my Odroid HC1 board.
>>
>> At least bisect points to it.
>>
>> If fixed, please add:
>>
>> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
>>
>> [    4.861683] 8<--- cut here ---
>> [    4.863429] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf0c8802c
>> [    4.871018] [f0c8802c] *pgd=433ed811, *pte=11800653, *ppte=11800453
>> ...
>> [    5.164010]  panfrost_gpu_irq_handler from __handle_irq_event_percpu+0xcc/0x31c
>> [    5.171276]  __handle_irq_event_percpu from handle_irq_event+0x38/0x80
>> [    5.177765]  handle_irq_event from handle_fasteoi_irq+0x9c/0x250
>> [    5.183743]  handle_fasteoi_irq from generic_handle_domain_irq+0x28/0x38
>> [    5.190417]  generic_handle_domain_irq from gic_handle_irq+0x88/0xa8
>> [    5.196741]  gic_handle_irq from generic_handle_arch_irq+0x34/0x44
>> [    5.202893]  generic_handle_arch_irq from __irq_svc+0x8c/0xd0
>>
>> Full log:
>> https://krzk.eu/#/builders/21/builds/4392/steps/11/logs/serial0
>>
> 
> Hey Krzysztof,
> 
> This is interesting. It might be about the cores that are missing from the partial
> core_mask raising interrupts, but an external abort on non-linefetch is strange to
> see here.
> 
> Would you be available for some tests?
> 
> I'm thinking to call power_off on all cores (all shaders, all tilers, all l2s),
> regardless of what panfrost_get_core_mask() says, as it could be that your GPU
> powers on the cores that are unused by Panfrost by default, and that then we never
> turn them off, escalating to this issue.
> 
> If you can please please please test:
> 
> void panfrost_gpu_power_off(struct panfrost_device *pfdev)
> {
> 	u64 core_mask = panfrost_get_core_mask(pfdev);
> 	int ret;
> 	u32 val;
> 
> 	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
> 	gpu_write(pfdev, SHADER_PWROFF_HI, pfdev->features.shader_present >> 32);
> 	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
> 					 val, !val, 1, 1000);
> 	if (ret)
> 		dev_err(pfdev->dev, "shader power transition timeout");
> 
> 	gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
> 	gpu_write(pfdev, TILER_PWROFF_HI, pfdev->features.tiler_present >> 32);
> 	ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
> 					 val, !val, 1, 1000);
> 	if (ret)
> 		dev_err(pfdev->dev, "tiler power transition timeout");
> 
> 	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
> 	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
> 					 val, !val, 0, 1000);
> 	if (ret)
> 		dev_err(pfdev->dev, "l2 power transition timeout");
> 
> 	gpu_write(pfdev, L2_PWROFF_HI, pfdev->features.l2_present >> 32);
> 	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_HI,
> 					 val, !val, 0, 1000);
> 	if (ret)
> 		dev_err(pfdev->dev, "l2 power transition timeout");
> }
> 

Send a diff please - against next or some other commit sha from next.

Best regards,
Krzysztof




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux