On 21/11/2023 17:11, AngeloGioacchino Del Regno wrote: > Il 21/11/23 16:34, Krzysztof Kozlowski ha scritto: >> On 08/11/2023 14:20, Steven Price wrote: >>> On 02/11/2023 14:15, AngeloGioacchino Del Regno wrote: >>>> The layout of the registers {TILER,SHADER,L2}_PWROFF_LO, used to request >>>> powering off cores, is the same as the {TILER,SHADER,L2}_PWRON_LO ones: >>>> this means that in order to request poweroff of cores, we are supposed >>>> to write a bitmask of cores that should be powered off! >>>> This means that the panfrost_gpu_power_off() function has always been >>>> doing nothing. >>>> >>>> Fix powering off the GPU by writing a bitmask of the cores to poweroff >>>> to the relevant PWROFF_LO registers and then check that the transition >>>> (from ON to OFF) has finished by polling the relevant PWRTRANS_LO >>>> registers. >>>> >>>> While at it, in order to avoid code duplication, move the core mask >>>> logic from panfrost_gpu_power_on() to a new panfrost_get_core_mask() >>>> function, used in both poweron and poweroff. >>>> >>>> Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") >>>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx> >> >> >> Hi, >> >> This commit was added to next recently but it causes "external abort on >> non-linefetch" during boot of my Odroid HC1 board. >> >> At least bisect points to it. >> >> If fixed, please add: >> >> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx> >> >> [ 4.861683] 8<--- cut here --- >> [ 4.863429] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf0c8802c >> [ 4.871018] [f0c8802c] *pgd=433ed811, *pte=11800653, *ppte=11800453 >> ... >> [ 5.164010] panfrost_gpu_irq_handler from __handle_irq_event_percpu+0xcc/0x31c >> [ 5.171276] __handle_irq_event_percpu from handle_irq_event+0x38/0x80 >> [ 5.177765] handle_irq_event from handle_fasteoi_irq+0x9c/0x250 >> [ 5.183743] handle_fasteoi_irq from generic_handle_domain_irq+0x28/0x38 >> [ 5.190417] generic_handle_domain_irq from gic_handle_irq+0x88/0xa8 >> [ 5.196741] gic_handle_irq from generic_handle_arch_irq+0x34/0x44 >> [ 5.202893] generic_handle_arch_irq from __irq_svc+0x8c/0xd0 >> >> Full log: >> https://krzk.eu/#/builders/21/builds/4392/steps/11/logs/serial0 >> > > Hey Krzysztof, > > This is interesting. It might be about the cores that are missing from the partial > core_mask raising interrupts, but an external abort on non-linefetch is strange to > see here. > > Would you be available for some tests? > > I'm thinking to call power_off on all cores (all shaders, all tilers, all l2s), > regardless of what panfrost_get_core_mask() says, as it could be that your GPU > powers on the cores that are unused by Panfrost by default, and that then we never > turn them off, escalating to this issue. > > If you can please please please test: > > void panfrost_gpu_power_off(struct panfrost_device *pfdev) > { > u64 core_mask = panfrost_get_core_mask(pfdev); > int ret; > u32 val; > > gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present); > gpu_write(pfdev, SHADER_PWROFF_HI, pfdev->features.shader_present >> 32); > ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO, > val, !val, 1, 1000); > if (ret) > dev_err(pfdev->dev, "shader power transition timeout"); > > gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present); > gpu_write(pfdev, TILER_PWROFF_HI, pfdev->features.tiler_present >> 32); > ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO, > val, !val, 1, 1000); > if (ret) > dev_err(pfdev->dev, "tiler power transition timeout"); > > gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present); > ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO, > val, !val, 0, 1000); > if (ret) > dev_err(pfdev->dev, "l2 power transition timeout"); > > gpu_write(pfdev, L2_PWROFF_HI, pfdev->features.l2_present >> 32); > ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_HI, > val, !val, 0, 1000); > if (ret) > dev_err(pfdev->dev, "l2 power transition timeout"); > } > Send a diff please - against next or some other commit sha from next. Best regards, Krzysztof