Search Linux Wireless

Re: 6.7.0-rc1 + hacks deadlock bug, wifi netdev delete + cat of debugfs file.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2023-11-08 at 07:55 -0800, Ben Greear wrote:
> On 11/8/23 7:44 AM, Johannes Berg wrote:
> > On Wed, 2023-11-08 at 07:07 -0800, Ben Greear wrote:
> > > On 11/8/23 2:31 AM, Johannes Berg wrote:
> > > > On Tue, 2023-11-07 at 14:08 -0800, Ben Greear wrote:
> > > > > Hello,
> > > > > 
> > > > > I think this lockup is because iw is holding rtnl and wiphy mutex,
> > > > > and is blocked waiting for debugfs to be closed.  Another 'cat'
> > > > > program has debugfs file open, and is blocking on trying to acquire
> > > > > wiphy mutex.
> > > > > 
> > > > > I think we must not acquire wiphy mutex in debugfs methods, somehow,
> > > > > to resolve this deadlock.  I do not know a safe way to do that.
> > > > 
> > > > Hmm. I almost want to say "don't do that then", but I guess you're just
> > > > randomly accessing debugfs files.
> > > > 
> > > > I guess we can at least make the mutex acquisition in debugfs killable
> > > > (or interruptible), so you can recover from this.
> > > 
> > > If we can detect that the phy is going away in debugfs, then we could
> > > return early before attempting the lock?  That would catch most things,
> > > I guess,
> > > 
> > 
> > I don't think it would, it would still get locked on the mutex first.
> > 
> > >   but still a potential race since I guess we'd have to do that check
> > > w/out locks.  Can we do a try-mutex-lock, if not acquired, return if wiphy-going-away,
> > > else sleep a bit, try again?
> > 
> > That's kind of awful though? And it's not just the wiphy going away, a
> > lot of the debugfs files can go away individually (per station, per
> > link, per key even!).
> 
>  From the backtrace in the removal logic, it seems that something waits
> for a debugfs file to be closed.

Yes, debugfs remove waits for it to no longer have active users, but
that cannot succeed because the users are blocked on acquiring the
mutex.

> Maybe the logic attempting to get the
> mutex in debugfs can check if file is waiting to be deleted,
> combined with a try-mutex-lock logic, and bail out that way?

I don't know if there's a way to check that, but I'm also not sure how
you'd even implement that?

johannes




[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux