Thanks for the bug report!

Donald

On Thu, 2023-12-07 at 10:23 +0000, Steven Price wrote:
> On 07/12/2023 08:56, Boris Brezillon wrote:
> > On Thu, 7 Dec 2023 09:12:43 +0100
> > Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> wrote:
> >
> > > On Wed, 6 Dec 2023 16:55:42 +0000
> > > Steven Price <steven.price@xxxxxxx> wrote:
> > >
> > > > On 04/12/2023 17:32, Boris Brezillon wrote:
> > > > > The panthor driver is designed in a modular way, where each logical
> > > > > block is dealing with a specific HW-block or software feature. In order
> > > > > for those blocks to communicate with each other, we need a central
> > > > > panthor_device collecting all the blocks, and exposing some common
> > > > > features, like interrupt handling, power management, reset, ...
> > > > >
> > > > > This is what this panthor_device logical block is about.
> > > > >
> > > > > v3:
> > > > > - Add acks for the MIT+GPL2 relicensing
> > > > > - Fix 32-bit support
> > > > > - Shorten the sections protected by panthor_device::pm::mmio_lock to fix
> > > > >   lock ordering issues.
> > > > > - Rename panthor_device::pm::lock into panthor_device::pm::mmio_lock to
> > > > >   better reflect what this lock is protecting
> > > > > - Use dev_err_probe()
> > > > > - Make sure we call drm_dev_exit() when something fails half-way in
> > > > >   panthor_device_reset_work()
> > > > > - Replace CSF_GPU_LATEST_FLUSH_ID_DEFAULT with a constant '1' and a
> > > > >   comment to explain. Also remove setting the dummy flush ID on suspend.
> > > > > - Remove drm_WARN_ON() in panthor_exception_name()
> > > > > - Check pirq->suspended in panthor_xxx_irq_raw_handler()
> > > > >
> > > > > Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> > > > > Signed-off-by: Steven Price <steven.price@xxxxxxx>
> > > > > Acked-by: Steven Price <steven.price@xxxxxxx> # MIT+GPL2 relicensing,Arm
> > > > > Acked-by: Grant Likely <grant.likely@xxxxxxxxxx> # MIT+GPL2 relicensing,Linaro
> > > > > Acked-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> # MIT+GPL2 relicensing,Collabora
> > > >
> > > > We still have the "FIXME: this is racy"
> > >
> > > Yeah, I still didn't find a proper solution for that.
> >
> > This [1] should fix the race condition in the unplug path.
> >
> > [1]https://urldefense.com/v3/__https://gitlab.freedesktop.org/panfrost/linux/-/commit/b79b28069e524ae7fea22a9a158b870eab2d5f9a__;!!KCwjcDI!1x9mqPx9K2SprWdgzBBRExLDu8uVajVeJcmlMkMufmIqJi5TYLqiDhhBr1hlnBQQUVgHnJKnWInjn7rWq0H_iLg$
>
> Looks like it should do the job. I'm surprised that we're the only ones
> to face this though.
>
> Looking at the new imagination driver it appears there's a similar problem:
>
> pvr_device_lost() uses a boolean 'lost' to track multiple calls but that
> boolean isn't protected by any specific lock (AFAICT). Indeed
> pvr_device_lost() calls drm_dev_unplug() while in a drm_dev_enter()
> critical section (see pvr_mmu_flush_exec()). If I'm not mistaken that's
> the same problem we discussed and isn't allowed? drm_dev_unplug() will
> synchronise on the SRCU that drm_dev_enter() is holding.
>
> +CC: Frank, Donald, Matt from Imagination.
>
> Steve
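
To make the two concerns above concrete, here is a rough userspace sketch in C11. It is only a model: every model_* name is hypothetical, a plain reader counter stands in for the SRCU read side, and model_unplug() reports "would block" instead of actually waiting. The deferred-unplug handling is one possible shape assumed for illustration; it is not the fix from the commit linked above, nor what either driver does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	atomic_int readers;         /* stands in for the SRCU read-side count */
	atomic_bool unplugged;      /* stands in for the device "unplugged" state */
	atomic_bool lost;           /* set by the first "device lost" report */
	atomic_bool unplug_pending; /* unplug requested, to run outside the section */
};

/* Models entering the critical section: refused once the device is unplugged. */
static bool model_enter(struct model_dev *d)
{
	atomic_fetch_add(&d->readers, 1);
	if (atomic_load(&d->unplugged)) {
		atomic_fetch_sub(&d->readers, 1);
		return false;
	}
	return true;
}

/* Models leaving the critical section. */
static void model_exit(struct model_dev *d)
{
	atomic_fetch_sub(&d->readers, 1);
}

/*
 * Models unplug. A real implementation would wait for all readers to leave;
 * this model returns false instead, so the problem case can be observed
 * without hanging.
 */
static bool model_unplug(struct model_dev *d)
{
	if (atomic_load(&d->readers) != 0)
		return false; /* would wait here, forever if the caller is a reader */
	atomic_store(&d->unplugged, true);
	return true;
}

/*
 * Models a "device lost" report that is safe to call from inside the section.
 * atomic_exchange() returns the previous value, so only the first caller
 * proceeds, and the unplug itself is only requested, not performed.
 */
static bool model_device_lost(struct model_dev *d)
{
	if (atomic_exchange(&d->lost, true))
		return false; /* already reported */
	atomic_store(&d->unplug_pending, true);
	return true;
}

/* Runs a requested unplug; meant to be called outside any critical section. */
static bool model_handle_pending(struct model_dev *d)
{
	if (!atomic_exchange(&d->unplug_pending, false))
		return false; /* nothing requested */
	return model_unplug(d);
}

/* Walks through the scenario; returns 0 when the model behaves as described. */
static int model_demo(void)
{
	struct model_dev d = {0};
	bool ok;

	ok = model_enter(&d);
	assert(ok);
	ok = model_unplug(&d);          /* unplug from inside the section */
	assert(!ok);                    /* reported as "would block" */
	ok = model_device_lost(&d);     /* first report wins */
	assert(ok);
	ok = model_device_lost(&d);     /* repeated report is ignored */
	assert(!ok);
	model_exit(&d);
	ok = model_handle_pending(&d);  /* no readers left: unplug completes */
	assert(ok);
	ok = model_enter(&d);           /* later sections are refused */
	assert(!ok);
	return 0;
}
```

In this model the unplug only completes once the reporting path has left its own critical section, and the atomic exchange on 'lost' ensures that concurrent reports trigger a single unplug request.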