Re: [PATCH v3 03/14] drm/panthor: Add the device logical block

On Thu, 7 Dec 2023 10:23:55 +0000
Steven Price <steven.price@xxxxxxx> wrote:

> On 07/12/2023 08:56, Boris Brezillon wrote:
> > On Thu, 7 Dec 2023 09:12:43 +0100
> > Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> wrote:
> >   
> >> On Wed, 6 Dec 2023 16:55:42 +0000
> >> Steven Price <steven.price@xxxxxxx> wrote:
> >>  
> >>> On 04/12/2023 17:32, Boris Brezillon wrote:    
> >>>> The panthor driver is designed in a modular way, where each logical
> >>>> block is dealing with a specific HW-block or software feature. In order
> >>>> for those blocks to communicate with each other, we need a central
> >>>> panthor_device collecting all the blocks, and exposing some common
> >>>> features, like interrupt handling, power management, reset, ...
> >>>>
> >>>> This what this panthor_device logical block is about.
> >>>>
> >>>> v3:
> >>>> - Add acks for the MIT+GPL2 relicensing
> >>>> - Fix 32-bit support
> >>>> - Shorten the sections protected by panthor_device::pm::mmio_lock to fix
> >>>>   lock ordering issues.
> >>>> - Rename panthor_device::pm::lock into panthor_device::pm::mmio_lock to
> >>>>   better reflect what this lock is protecting
> >>>> - Use dev_err_probe()
> >>>> - Make sure we call drm_dev_exit() when something fails half-way in
> >>>>   panthor_device_reset_work()
> >>>> - Replace CSF_GPU_LATEST_FLUSH_ID_DEFAULT with a constant '1' and a
> >>>>   comment to explain. Also remove setting the dummy flush ID on suspend.
> >>>> - Remove drm_WARN_ON() in panthor_exception_name()
> >>>> - Check pirq->suspended in panthor_xxx_irq_raw_handler()
> >>>>
> >>>> Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> >>>> Signed-off-by: Steven Price <steven.price@xxxxxxx>
> >>>> Acked-by: Steven Price <steven.price@xxxxxxx> # MIT+GPL2 relicensing,Arm
> >>>> Acked-by: Grant Likely <grant.likely@xxxxxxxxxx> # MIT+GPL2 relicensing,Linaro
> >>>> Acked-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> # MIT+GPL2 relicensing,Collabora      
> >>>
> >>> We still have the "FIXME: this is racy"    
> >>
> >> Yeah, I still didn't find a proper solution for that.  
> > 
> > This [1] should fix the race condition in the unplug path.
> > 
> > [1]https://gitlab.freedesktop.org/panfrost/linux/-/commit/b79b28069e524ae7fea22a9a158b870eab2d5f9a  
> 
> Looks like it should do the job. I'm surprised that we're the only ones
> to face this though.

Most drivers have a single path that calls drm_dev_unplug(): the
device removal callback, which is only called once per device. The
only exceptions I see, with more than one call site, are the amdgpu
and powervr drivers. I guess amdgpu has some tricks to serialize
_unplug operations, and powervr is probably buggy as you pointed out.

> 
> Looking at the new imagination driver it appears there's a similar problem:
> 
> pvr_device_lost() uses a boolean 'lost' to track multiple calls but that
> boolean isn't protected by any specific lock (AFAICT). Indeed
> pvr_device_lost() calls drm_dev_unplug() while in a drm_dev_enter()
> critical section (see pvr_mmu_flush_exec()). If I'm not mistaken that's
> the same problem we discussed and isn't allowed?

It is the same problem, indeed. That means there's a deadlock when
pvr_device_lost() is called from the MMU code. And I guess the race
condition with concurrent pvr_device_lost() callers exists too (unless
I missed something, and calls to pvr_device_lost() are serialized with
another lock).
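
To make the "concurrent callers" half of the problem concrete, here is
a minimal userspace sketch, assuming C11 atomics, of letting exactly
one caller win the right to unplug. All names (fake_device,
fake_unplug(), fake_device_lost()) are hypothetical stand-ins, not the
real pvr or panthor code, and this is not necessarily what [1] does:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver's device struct (not pvr_device). */
struct fake_device {
	atomic_bool lost;   /* set once, by whichever caller gets there first */
	int unplug_calls;   /* counts how many times the unplug step ran */
};

/* Hypothetical stand-in for the one-shot unplug step. */
static void fake_unplug(struct fake_device *dev)
{
	dev->unplug_calls++;
}

/*
 * Returns true only for the caller that actually performed the unplug.
 * atomic_exchange() returns the previous value, so exactly one caller
 * sees 'false' and proceeds; a plain "if (!lost) lost = true;" check
 * gives no such guarantee when two callers race.
 */
static bool fake_device_lost(struct fake_device *dev)
{
	if (atomic_exchange(&dev->lost, true))
		return false;

	fake_unplug(dev);
	return true;
}
```

Note that this only decides who unplugs; a losing caller returns without
waiting for the winner to finish. It also does nothing about the
deadlock case: the unplug step would still have to run outside any
drm_dev_enter() section.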

> drm_dev_unplug() will
> synchronise on the SRCU that drm_dev_enter() is holding.
> 
> +CC: Frank, Donald, Matt from Imagination.
> 
> Steve
> 



