On 29.08.2024 15:57, Liviu Dudau wrote: > Hi Adrián, > > On Sat, Aug 17, 2024 at 11:10:17AM +0200, Boris Brezillon wrote: > > On Fri, 16 Aug 2024 19:52:49 +0100 > > Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx> wrote: > > > > > In the off-chance that waiting for the firmware to signal its booted status > > > timed out in the fast reset path, one must flush the cache lines for the > > > entire FW VM address space before reloading the regions, otherwise stale > > > values eventually lead to a scheduler job timeout. > > > > > > Signed-off-by: Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx> > > > > We probably want Fixes/Cc-stable tags here. > > > > > --- > > > drivers/gpu/drm/panthor/panthor_fw.c | 8 +++++++- > > > drivers/gpu/drm/panthor/panthor_mmu.c | 19 ++++++++++++++++--- > > > drivers/gpu/drm/panthor/panthor_mmu.h | 1 + > > > 3 files changed, 24 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c > > > index 857f3f11258a..ef232c0c2049 100644 > > > --- a/drivers/gpu/drm/panthor/panthor_fw.c > > > +++ b/drivers/gpu/drm/panthor/panthor_fw.c > > > @@ -1089,6 +1089,12 @@ int panthor_fw_post_reset(struct panthor_device *ptdev) > > > panthor_fw_stop(ptdev); > > > ptdev->fw->fast_reset = false; > > > drm_err(&ptdev->base, "FW fast reset failed, trying a slow reset"); > > > + > > > + ret = panthor_vm_flush_all(ptdev->fw->vm); > > > + if (ret) { > > > + drm_err(&ptdev->base, "FW slow reset failed (couldn't flush FW's AS l2cache)"); > > > + return ret; > > > + } > > > } > > > > > > /* Reload all sections, including RO ones. We're not supposed > > > @@ -1099,7 +1105,7 @@ int panthor_fw_post_reset(struct panthor_device *ptdev) > > > > > > ret = panthor_fw_start(ptdev); > > > if (ret) { > > > - drm_err(&ptdev->base, "FW slow reset failed"); > > > + drm_err(&ptdev->base, "FW slow reset failed (couldn't start the FW )"); > > > return ret; > > > } > > > > > > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c > > > index d47972806d50..a77ee5ce691d 100644 > > > --- a/drivers/gpu/drm/panthor/panthor_mmu.c > > > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c > > > @@ -874,14 +874,27 @@ static int panthor_vm_flush_range(struct panthor_vm *vm, u64 iova, u64 size) > > > if (!drm_dev_enter(&ptdev->base, &cookie)) > > > return 0; > > > > > > - /* Flush the PTs only if we're already awake */ > > > - if (pm_runtime_active(ptdev->base.dev)) > > > - ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT); > > > + /* > > > + * If we made it this far, that means the device is awake, because > > > + * upon device suspension, all active VMs are given an AS id of -1 > > > + */ > > > + ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT); > > > > I would normally prefer this change to be in its own commit, but given > > this is needed to be able to flush caches in the resume path, I'm fine > > keeping it in the same patch. The comment is a bit odd now that you > > dropped the pm_runtime_active() call though. I would rather have a > > comment in mmu_hw_do_operation_locked(), after the AS ID check > > explaining that as.id >= 0 guarantees that the HW is up and running, > > and that we can proceed with the flush operation without calling > > pm_runtime_active(). > > Given Boris' comments, are you planning on sending a v2? Sure, I got caught up in a bunch of higher priority tasks this week, but I'll be sending a v2 either before the end of this week or early next week. > Best regards, > Liviu > > > > > > > > > drm_dev_exit(cookie); > > > return ret; > > > } > > > > > > +/** > > > + * panthor_vm_flush_all() - Flush L2 caches for the entirety of a VM's AS > > > + * @vm: VM whose cache to flush > > > + * > > > + * Return: 0 on success, a negative error code if flush failed. > > > + */ > > > +int panthor_vm_flush_all(struct panthor_vm *vm) > > > +{ > > > + return panthor_vm_flush_range(vm, vm->base.mm_start, vm->base.mm_range); > > > +} > > > + > > > static int panthor_vm_unmap_pages(struct panthor_vm *vm, u64 iova, u64 size) > > > { > > > struct panthor_device *ptdev = vm->ptdev; > > > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.h b/drivers/gpu/drm/panthor/panthor_mmu.h > > > index f3c1ed19f973..6788771071e3 100644 > > > --- a/drivers/gpu/drm/panthor/panthor_mmu.h > > > +++ b/drivers/gpu/drm/panthor/panthor_mmu.h > > > @@ -31,6 +31,7 @@ panthor_vm_get_bo_for_va(struct panthor_vm *vm, u64 va, u64 *bo_offset); > > > int panthor_vm_active(struct panthor_vm *vm); > > > void panthor_vm_idle(struct panthor_vm *vm); > > > int panthor_vm_as(struct panthor_vm *vm); > > > +int panthor_vm_flush_all(struct panthor_vm *vm); > > > > > > struct panthor_heap_pool * > > > panthor_vm_get_heap_pool(struct panthor_vm *vm, bool create); > > Adrian Larumbe