On 26-4-2017 9:16, Johannes Berg wrote: > On Fri, 2017-04-21 at 13:05 +0100, Arend van Spriel wrote: >> Have proper request id filled in the SCHED_SCAN_RESULTS and >> SCHED_SCAN_STOPPED notifications toward user-space by having the >> driver provide it through the api. >> >> Reviewed-by: Hante Meuleman <hante.meuleman@xxxxxxxxxxxx> >> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@xxxxxxxxxxx >> m> >> Reviewed-by: Franky Lin <franky.lin@xxxxxxxxxxxx> > > All your reviewers aren't paying attention ;-) > >> @@ -1722,6 +1723,7 @@ struct cfg80211_sched_scan_request { >> struct cfg80211_bss_select_adjust rssi_adjust; >> >> /* internal */ >> + struct work_struct results_wk; >> struct wiphy *wiphy; >> struct net_device *dev; >> unsigned long scan_start; > > Superficially, this is fine - but you need to think hard about the > semantics of when you cancel this work. > >> +++ b/net/wireless/scan.c >> @@ -369,15 +369,13 @@ void __cfg80211_sched_scan_results(struct >> work_struct *wk) >> struct cfg80211_registered_device *rdev; >> struct cfg80211_sched_scan_request *request; >> >> - rdev = container_of(wk, struct cfg80211_registered_device, >> - sched_scan_results_wk); >> + request = container_of(wk, struct >> cfg80211_sched_scan_request, results_wk); >> + rdev = wiphy_to_rdev(request->wiphy); >> >> rtnl_lock(); >> >> - request = cfg80211_find_sched_scan_req(rdev, 0); >> - >> /* we don't have sched_scan_req anymore if the scan is >> stopping */ > > That comment is kinda wrong now, afaict? Also you return > >> - if (!IS_ERR(request)) { >> + if (request) { > > This seems wrong - you do return an ERR_PTR() from find. Not that > there's all that much point in doing so vs. returning NULL, might be > worth cleaning it up. Indeed. Not sure if it was me being sloppy/confused or due to rebasing the patches. Anyway, I will fix this. >> -void cfg80211_sched_scan_results(struct wiphy *wiphy) >> +void cfg80211_sched_scan_results(struct wiphy *wiphy, u64 reqid) >> { >> - struct cfg80211_registered_device *rdev = >> wiphy_to_rdev(wiphy); >> struct cfg80211_sched_scan_request *request; >> >> - trace_cfg80211_sched_scan_results(wiphy); >> + trace_cfg80211_sched_scan_results(wiphy, reqid); >> /* ignore if we're not scanning */ >> >> rtnl_lock(); >> - request = cfg80211_find_sched_scan_req(rdev, 0); >> + request = cfg80211_find_sched_scan_req(wiphy_to_rdev(wiphy), >> reqid); >> rtnl_unlock(); >> >> if (!IS_ERR(request)) >> - queue_work(cfg80211_wq, >> - &wiphy_to_rdev(wiphy)- >>> sched_scan_results_wk); >> + queue_work(cfg80211_wq, &request->results_wk); >> + else >> + wiphy_err(wiphy, "reqid %llu not found\n", reqid); >> } > > This seems problematic - you use the request pointer outside the > locking now. I'd move that rtnl_unlock() to afterwards. The message > could also mention sched scan, that might be useful. Although I suspect > that may happen due to races (e.g. netlink socket close vs. firmware > stop) so maybe it's not all that useful. When adding the worker in the request I was thinking about what might happen between the queue_work and the work actually being scheduled. > Most importantly though, you don't protect against queuing the work > here, and then deleting the request. In the old case the comment that I > pointed out above will save us, but in this new case it can't (the > comment is now wrong), and that's a problem. > > I looked briefly at this and I suspect it will be very hard to fix that > with a per-request work struct. It might be better to have a per-work > status flag that you set here, then you schedule the global work, and > that global work will send all the appropriate result messages after > clearing the status bit again, similar to what I did with the netlink > destroy now. I guess it is in you repo as I did not see anything related on the mailing list. So regarding this rework, do you want me to send the series again or just this patch? Regards, Arend