Search Linux Wireless

Re: Missing wiphy lock in ieee80211_color_collision_detection_work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 20, 2024 at 02:12:23PM +0200, Johannes Berg wrote:
> On Fri, 2024-09-20 at 14:10 +0200, Remi Pommarel wrote:
> > On Fri, Sep 20, 2024 at 02:07:19PM +0200, Remi Pommarel wrote:
> > > On Fri, Sep 20, 2024 at 12:00:08PM +0200, Johannes Berg wrote:
> > > > On Fri, 2024-09-20 at 10:41 +0200, Remi Pommarel wrote:
> > > > > > I think using timer_pending(&link->color_collision_detect_work->timer)
> > > > > > to replace delayed_work_pending(), even if the semantic is a bit
> > > > > > different, would be ok to fulfill the rate limiting purpose.
> > > > 
> > > > I think you're right. We could as well check list_empty() or so, but it
> > > > wouldn't make a big difference. As you say:
> > > > 
> > > > > > Having the
> > > > > > same delayed_work_pending() semantics on wiphy work queue would require
> > > > > > to take wiphy lock
> > > > 
> > > > To really have it known _precisely_, that's true.
> > > > 
> > > > > >  which seem a bit superfluous here.
> > > > 
> > > > It's actually simply also not possible - if we could sleep in
> > > > ieee80211_obss_color_collision_notify() and take the wiphy mutex, we'd
> > > > probably have done this completely differently :)
> > > > 
> > > > And a hypothetical wiphy_delayed_work_pending() API should therefore not
> > > > be required to be called with the wiphy mutex held.
> > > > 
> > > > I think that perhaps we should add such a trivial inline instead of
> > > > open-coding the timer_pending() check, but I'm not sure whether or not
> > > > it should also check the list (i.e. check if timer has expired, but work
> > > > hasn't started yet): on the one hand it seems more appropriate, and if
> > > > actually holding the wiphy mutex it would in fact be completely correct,
> > > > on the other hand maybe it encourages being sloppily thinking the return
> > > > value is always perfect?
> > > > 
> > > > Right now I'd tend to have the check and document that it's only
> > > > guaranteed when under the wiphy mutex.
> > > 
> > > I had this exact train of thought and was replying that I did agree on
> > > that, and then I checked the other *_pending semantics for which I tend
> > > to forget the details.
> 
> Haha, yes, no kidding.
> 
> > > IIUC timer_pending() and work_pending() can both
> > > return false if callback is still running (or even not yet started).
> > > Thus the hypothetical wiphy_work_pending() semantics could allow to
> > > implement it using timer_pending().
> > > 
> > > Adding a list_empty() check while making it more "precise" does not make
> > > it "perfect" (even if there is no clear notion of what perfection should
> > > be here) even with wiphy lock held. Indeed if wiphy_work_pending() is
> > > called precisely after delayed work timer has cleared its pending state
> > > but before the callback (i.e wiphy_delayed_work_timer()) has added the
> > > work in the list both !timer_pending() and list_empty(work->entry) would
> > > be true. So there is a small window where wiphy_work_pending() wouldn't
> > > be more precise than just checking timer_pending() as show below:
> > > 
> > >       CPU0                               CPU1
> > >  expire_timers
> > >    detach_timer
> > >                                     wiphy_work_pending() -> return false
> > >    timer->function
> > >      wiphy_work_queue
> > >        list_add_tail()
> > >                                     wiphy_work_pending() -> return false
> > 
> > I meant wiphy_work_pending() -> return true here ^.
> > 
> Good point! The second call could even be before the list_add_tail and
> still return false, so yeah, wiphy lock doesn't do anything.
> 
> But I guess we can live with both of these too, given sufficient
> documentation :)

Also, sorry for being bothersome, wiphy_delayed_work_queue() is not a
genuine equivalent of queue_delayed_work() as the latter does not
requeue work if already scheduled while the former does so effectively
changing the delayed work deadline (very mod_delayed_work() alike).

If we wanted to fix that, ieee80211_obss_color_collision_notify() would
simply use wiphy_delayed_work_queue() directly. But other calls would
have to be rechecked and switched to use wiphy_delayed_work_mod() if
needed be.

> 
> Agree also with Nicolas though that we could just ratelimit this
> differently too, this was just a convenient way of achieving it with the
> existing infrastructure.

Yes that would work also but still need to convert into wiphy work the
color collision work to avoid deadlock.

-- 
Remi




[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux