Hi,

On Thu, 13 Jan 2022 at 12:07, Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
>
> Hi Alexander,
>
> alex.aring@xxxxxxxxx wrote on Wed, 12 Jan 2022 17:44:02 -0500:
>
> > Hi,
> >
> > On Wed, 12 Jan 2022 at 12:33, Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> > ...
> > > +     return 0;
> > > +}
> > > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> > > index c829e4a75325..40656728c624 100644
> > > --- a/net/mac802154/tx.c
> > > +++ b/net/mac802154/tx.c
> > > @@ -54,6 +54,9 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
> > >         struct net_device *dev = skb->dev;
> > >         int ret;
> > >
> > > +       if (unlikely(mac802154_scan_is_ongoing(local)))
> > > +               return NETDEV_TX_BUSY;
> > > +
> >
> > Please look into the functions "ieee802154_wake_queue()" and
> > "ieee802154_stop_queue()" which prevent this function from being
> > called. Call stop before starting scanning and wake after scanning is
> > done or stopped.
>
> Mmmh all this is already done, isn't it?
> - mac802154_trigger_scan_locked() stops the queue before setting the
>   promiscuous mode
> - mac802154_end_of_scan() wakes the queue after resetting the
>   promiscuous mode to its original state
>
> Should I drop the check which stands for an extra precaution?
>

No, I think it should then rather be a WARN_ON() without any return
(hopefully it will survive). This case should never happen; otherwise
we have a bug where the queue gets woken while we have "taken control
of transmissions". Please also change the name, I think in the future
it will not be only scan related. Maybe "mac802154_queue_stopped()".
Everything that is queued from a socket or an upper layer (6lowpan)
goes this way.

>
> But overall I think I don't understand well this part. What is
> a bit foggy to me is why the (async) tx implementation does:
>
>      *Core*                 *Driver*
>
>   stop_queue()
>   drv_async_xmit() -------
>                           \------> do something
>                            ------- calls ieee802154_xmit_complete()
>   wakeup_queue() <--------/
>
> So we actually disable the queue for transmitting. Why??
>

Because all transceivers have either _one_ transmit framebuffer or one
framebuffer that is used for transmit and receive, one at a time. We
need to tell the stack to stop giving us more skbs while we are busy
transmitting one. All of this will/must change in the future if more
powerful hardware shows up and the driver needs to control the flow
here.

ieee802154_xmit_complete() calling wakeup_queue needs to be forbidden
when we are in "synchronous transmit mode"/the queue is stopped. The
synchronous transmit mode is not for any hotpath, it's for MLME, and I
think we also need a per phy lock to avoid multiple synchronous
transmissions at one time.

Please note that I am not only thinking about the scan operation here,
but also about other possible MLME-ops.

> > Also there exists a race which exists in your way and also the one
> > mentioned above. There can still be some transmissions going on... We
> > need to wait until "all possible" tx completions are done... to be
> > sure there are really no transmissions going on. However we need to be
> > sure that a wake cannot be done if a tx completion is done, we need to
> > avoid it when the scan operation is ongoing as a workaround for this
> > race.
> >
> > This race exists and should be fixed in future work?
>
> Yep, this is true, do you have any pointers? Because I looked at the
> code and for now it appears quite unpractical to add some kind of
> flushing mechanism on that net queue. I believe we cannot use the netif
> interface for that so we would have to implement our own mechanism in
> the ieee802154 core.

Yes, we need some kind of "wait_for_completion()" and "complete()". We
are currently lucky that we allow only one skb to be transmitted at one
time. I think it is okay to put that on a per phy basis...

- Alex
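
For illustration only, the gating discussed above can be modelled in
plain user-space C. This is a rough single-threaded sketch, not kernel
code: struct phy_model and every model_*() helper are hypothetical names
that do not exist in mac802154. It assumes a per phy hold counter that
keeps a tx completion from waking the queue while an MLME operation owns
transmissions, and an ongoing-tx counter that the MLME path would have
to see drop to zero before it starts its own transmission.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-PHY state; none of these names exist in mac802154. */
struct phy_model {
	bool queue_stopped;	/* mirrors the net queue state */
	int hold_count;		/* > 0 while an MLME op owns transmissions */
	int ongoing_tx;		/* frames given to the driver, not yet completed */
};

static void model_stop_queue(struct phy_model *phy)
{
	phy->queue_stopped = true;
}

/* Waking is refused for as long as somebody holds the queue. */
static void model_wake_queue(struct phy_model *phy)
{
	if (phy->hold_count == 0)
		phy->queue_stopped = false;
}

/* Data path: one frame at a time, so stop the queue while busy. */
static bool model_xmit(struct phy_model *phy)
{
	if (phy->queue_stopped)
		return false;	/* roughly what NETDEV_TX_BUSY would report */
	model_stop_queue(phy);
	phy->ongoing_tx++;
	return true;
}

/* Driver completion: may only wake the queue if nobody holds it. */
static void model_xmit_complete(struct phy_model *phy)
{
	phy->ongoing_tx--;
	model_wake_queue(phy);
}

/* MLME path (scan or any other op): take control of transmissions. */
static void model_hold_queue(struct phy_model *phy)
{
	phy->hold_count++;
	model_stop_queue(phy);
}

/* Condition the MLME path has to wait for before transmitting itself. */
static bool model_tx_idle(const struct phy_model *phy)
{
	return phy->ongoing_tx == 0;
}

/* MLME path is done: give the queue back to the data path. */
static void model_release_queue(struct phy_model *phy)
{
	phy->hold_count--;
	model_wake_queue(phy);
}
```

In a real implementation the polling of model_tx_idle() would presumably
be a sleep on something like wait_for_completion(), with the tx
completion path calling complete(), and the counters would need proper
locking or atomics. The model only shows the ordering: hold, wait for
idle, do the MLME work, release.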