Re: [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 18, 2023 at 08:24:01PM +0200, Joerg Roedel wrote:
> On Fri, Aug 18, 2023 at 01:06:43PM -0300, Jason Gunthorpe wrote:
> > On Fri, Aug 18, 2023 at 05:56:20PM +0200, Joerg Roedel wrote:
> > > On Thu, Aug 17, 2023 at 03:33:16PM -0300, Jason Gunthorpe wrote:
> > > > Bascially.. Yikes!
> > > 
> > > Hmm, that is a difficult situation. Even if the problem is a misuse of
> > > the APIs we can not just blindly break other drivers by our core
> > > changes.
> > 
> > They are not broken, they just throw a lockdep warning and keep going
> > as before. This is what triggers:
> > 
> > static inline void device_lock_assert(struct device *dev)
> > {
> > 	lockdep_assert_held(&dev->mutex);
> > }
> > 
> > So non-debug builds won't even see anything.
> 
> But this still means that a function is called without holding the
> proper lock.

It has alway been like that. of_dma_configure_id() always required the
device_lock to be held!

eg as one example:

 of_iommu_configure
  of_iommu_configure_device
   of_iommu_configure_dev
    of_iommu_xlate
     iommu_fwspec_init
       dev_iommu_get

It is subtle, but the device_lock is what protects the store to
dev->iommu inside dev_iommu_get(). In v6.5-rc1 many callers held the
device lock here, and after this series only these broken drivers
don't.

The driver assumes it has exclusive use of the platform device it
steals. Not just for this call, but in general it does other stuff
that rests on this assumption. Since it is exclusive it doesn't
actually need any locking - this is why it works reliably as is today.

Again, there is no practical bug here, the driver works fine. On
non-debug kernels there is no warning or functional issue. Debug
kernels rightly highlight that the API is being used wrong.

It is wrong use of the APIs because someone could go and use sysfs to
attach, say, VFIO to the stolen platform_device and cause all kinds of
kernel problems.

> I can't send anything with known problems upstream.

In my view this is not creating a new problem. It is exposing existing
problems with a debugging message only on debug kernels.

However, if you view the debug message as a problem then I suggest we
simply comment out with a note the device_lock_assert() from the iommu
code. This would be sad because the vast majority of systems don't use
these badly designed drivers.

This puts it back to the v6.5-rc1 behavior where of_dma_configure_id()
won't make any prints if it is called with wrong locking.

Please let me know.

Jason



[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux