Re: [PATCH v2 05/12] drm/panfrost: Disable the AS on unhandled page faults

Steven Price <steven.price@xxxxxxx> · Mon, 21 Jun 2021 16:09:32 +0100

On 21/06/2021 14:39, Boris Brezillon wrote:
> If we don't do that, we have to wait for the job timeout to expire
> before the faulting jobs get killed.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>

Don't we need to do something here to allow recovery of the MMU context
in the future? panfrost_mmu_disable() will zero out the MMU registers on
the hardware, but AFAICS panfrost_mmu_enable() won't be called to
restore the values until something evicts the address space (GPU power
down/reset or just too many other processes).

The ideal would be to block submission of new jobs from this context and
then wait until existing jobs have completed, at which point the MMU
state can be restored and jobs allowed again.

But at a minimum I think we should have something like an 'MMU poisoned'
bit that panfrost_mmu_as_get() can check.
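
As a rough sketch of the 'poisoned' idea: the following is a standalone
userspace mock, not driver code. struct mock_mmu, mock_mmu_poison(),
mock_mmu_as_get() and mock_mmu_recover() are all hypothetical names, and
having the as_get path return an error code is an assumption about how
the check might be surfaced to callers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the per-context MMU state; not the driver's layout. */
struct mock_mmu {
	int as;		/* assigned AS slot, or -1 if none */
	bool poisoned;	/* hypothetical flag: set after an unhandled page fault */
};

/*
 * Hypothetical: would be called from the fault handler alongside
 * panfrost_mmu_disable(), so the context is marked unusable.
 */
static void mock_mmu_poison(struct mock_mmu *mmu)
{
	assert(mmu);
	mmu->poisoned = true;
}

/*
 * Stand-in for panfrost_mmu_as_get(): refuse to hand out an AS while
 * the context is poisoned (assumed error-return convention).
 */
static int mock_mmu_as_get(struct mock_mmu *mmu)
{
	assert(mmu);
	if (mmu->poisoned)
		return -EFAULT;
	if (mmu->as < 0)
		mmu->as = 0;	/* real code would pick a free slot and program the HW */
	return mmu->as;
}

/*
 * Hypothetical recovery step: once outstanding jobs have drained, clear
 * the flag and drop the stale slot so the next get reprograms the MMU.
 */
static void mock_mmu_recover(struct mock_mmu *mmu)
{
	assert(mmu);
	mmu->poisoned = false;
	mmu->as = -1;
}
```

With this shape, a job submitted after the fault fails early at the
as_get step instead of running against zeroed MMU registers, and the
recover step is where the "wait for existing jobs, then restore" logic
would hook in.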

Steve

> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 2a9bf30edc9d..d5c624e776f1 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -661,7 +661,7 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
>  		if ((status & mask) == BIT(as) && (exception_type & 0xF8) == 0xC0)
>  			ret = panfrost_mmu_map_fault_addr(pfdev, as, addr);
>  
> -		if (ret)
> +		if (ret) {
>  			/* terminal fault, print info about the fault */
>  			dev_err(pfdev->dev,
>  				"Unhandled Page fault in AS%d at VA 0x%016llX\n"
> @@ -679,6 +679,10 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
>  				access_type, access_type_name(pfdev, fault_status),
>  				source_id);
>  
> +			/* Disable the MMU to stop jobs on this AS immediately */
> +			panfrost_mmu_disable(pfdev, as);
> +		}
> +
>  		status &= ~mask;
>  
>  		/* If we received new MMU interrupts, process them before returning. */
>