Focusing on this part for a moment, because it affects not just wireless: > Then, in netif_carrier_on(), we immediately set the carrier on bit, so > that you can actually immediately see this from userspace if you ask > rtnetlink, however, it's not actually immediately _functional_ - it > still needs to schedule and run the linkwatch work first, to call > dev_activate() to change the (TX queue) qdisc(s) away from noop. > > Also, even though you can already query the carrier state and see it on, > the actual rtnetlink _event_ for this only happens from the linkwatch > work as well, via netdev_state_change(). > > All of this makes sense since you need to hold RTNL for all those state > changes/notifier chains, but it does lead to the first race/consistency > problem: if you query at just the right time you can see carrier being > on, however, if the carrier is actually removed again and the linkwatch > work didn't run yet, there might never be an event for the carrier on, > iow, you might have: > > netif_carrier_on() > query from userspace and see carrier on > netif_carrier_off() > linkwatch work runs and sends only carrier off event and also because, as Andrew mentioned, you can have the exact opposite problem... It can actually happen that something _else_ sends an event, so even if userspace does't query but waits for a carrier on event, you could end up with: * netif_carrier_on() * something else triggers netdev_state_change(), even userspace setting link alias netdev_state_change() -> sends an rtnetlink event saying carrier is on * userspace transmits but frames are dropped * linkwatch work runs and enables qdiscs only now To address this issue, we could introduce a new state, say __LINK_STATE_CARRIER_COMPLETE or something like that, which is used when communicating carrier state to userspace, and only set/cleared in dev_activate()/dev_deactivate(). That way, any events to userspace (or userspace querying) wouldn't show the carrier state until it's actually fully reflected in software (qdiscs) too. This doesn't fully solve _my_ (wifi) problem, but perhaps lets me work around it in userspace by querying for the carrier state, if it's reflected correctly ("fully ready to transmit") then we can do that. Right now, we can't even do that. But would it break something else? There's also a way to query it via ethtool, which perhaps should _not_ be converted to _COMPLETE since you could argue the ethtool link state is about the physical link, and with this change the carrier becomes about the logical link in a fashion? johannes