Hi,

On Wed, May 18, 2022 at 8:37 AM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
>
>
> alex.aring@xxxxxxxxx wrote on Wed, 18 May 2022 08:05:46 -0400:
>
> > Hi,
> >
> > On Wed, May 18, 2022 at 6:12 AM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> > >
> > >
> > > aahringo@xxxxxxxxxx wrote on Tue, 17 May 2022 21:14:03 -0400:
> > >
> > > > Hi,
> > > >
> > > > On Tue, May 17, 2022 at 9:30 AM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> > > > >
> > > > >
> > > > > aahringo@xxxxxxxxxx wrote on Sun, 15 May 2022 19:03:53 -0400:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Sun, May 15, 2022 at 6:28 PM Alexander Aring <aahringo@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Thu, May 12, 2022 at 10:34 AM Miquel Raynal
> > > > > > > <miquel.raynal@xxxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > This is the slow path, we need to wait for each command to be processed
> > > > > > > > before continuing so let's introduce a helper which does the
> > > > > > > > transmission and blocks until it gets notified of its asynchronous
> > > > > > > > completion. This helper is going to be used when introducing scan
> > > > > > > > support.
> > > > > > > >
> > > > > > > > Signed-off-by: Miquel Raynal <miquel.raynal@xxxxxxxxxxx>
> > > > > > > > ---
> > > > > > > >  net/mac802154/ieee802154_i.h |  1 +
> > > > > > > >  net/mac802154/tx.c           | 25 +++++++++++++++++++++++++
> > > > > > > >  2 files changed, 26 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> > > > > > > > index a057827fc48a..f8b374810a11 100644
> > > > > > > > --- a/net/mac802154/ieee802154_i.h
> > > > > > > > +++ b/net/mac802154/ieee802154_i.h
> > > > > > > > @@ -125,6 +125,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
> > > > > > > >  void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
> > > > > > > >  void ieee802154_xmit_sync_worker(struct work_struct *work);
> > > > > > > >  int ieee802154_sync_and_hold_queue(struct ieee802154_local *local);
> > > > > > > > +int ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb);
> > > > > > > >  netdev_tx_t
> > > > > > > >  ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
> > > > > > > >  netdev_tx_t
> > > > > > > > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> > > > > > > > index 38f74b8b6740..ec8d872143ee 100644
> > > > > > > > --- a/net/mac802154/tx.c
> > > > > > > > +++ b/net/mac802154/tx.c
> > > > > > > > @@ -128,6 +128,31 @@ int ieee802154_sync_and_hold_queue(struct ieee802154_local *local)
> > > > > > > >  	return ieee802154_sync_queue(local);
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +int ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb)
> > > > > > > > +{
> > > > > > > > +	int ret;
> > > > > > > > +
> > > > > > > > +	/* Avoid possible calls to ->ndo_stop() when we asynchronously perform
> > > > > > > > +	 * MLME transmissions.
> > > > > > > > +	 */
> > > > > > > > +	rtnl_lock();
> > > > > > >
> > > > > > > I think we should make an ASSERT_RTNL() here, the lock needs to be
> > > > > > > earlier than that over the whole MLME op. MLME can trigger more than
> > > > > >
> > > > > > not over the whole MLME_op, that's terrible to hold the rtnl lock so
> > > > > > long... so I think this is fine that some netdev call will interfere
> > > > > > with this transmission.
> > > > > > So forget about the ASSERT_RTNL() here, it's fine (I hope).
> > > > > >
> > > > > > > one message, the whole sync_hold/release queue should be earlier than
> > > > > > > that... in my opinion is it not right to allow other messages so far
> > > > > > > an MLME op is going on? I am not sure what the standard says to this,
> > > > > > > but I think it should be stopped the whole time? All those sequence
> > > > > >
> > > > > > Whereas the stop of the netdev queue makes sense for the whole mlme-op
> > > > > > (in my opinion).
> > > > >
> > > > > I might still implement an MLME pre/post helper and do the queue
> > > > > hold/release calls there, while only taking the rtnl from the _tx.
> > > > >
> > > > > And I might create an mlme_tx_one() which does the pre/post calls as
> > > > > well.
> > > > >
> > > > > Would something like this fit?
> > > >
> > > > I think so, I've heard for some transceiver types a scan operation can
> > > > take hours... but I guess whoever triggers that scan in such an
> > > > environment knows that it has some "side-effects"...
> > >
> > > Yeah, a scan requires the data queue to be stopped and all incoming
> > > packets to be dropped (other than beacons, ofc), so users must be
> > > aware of this limitation.
> >
> > I think there is a real problem about how the user can synchronize the
> > start of a scan and be sure that at this point everything was
> > transmitted, we might need to really "flush" the queue.
> > Your naming "flush" is also wrong, it will flush the framebuffer(s) of
> > the transceivers but not the netdev queue... and we probably should
> > flush the netdev queue before starting mlme-op... this is something to
> > add in the mlme_op_pre() function.
>
> Is it even possible? This requires waiting for the netdev queue to be
> empty before stopping it, but if users constantly flood the transceiver
> with data packets this might "never" happen.
>

Nothing is impossible, just maybe nobody thought about that. Sure,
putting more into the queue should be forbidden but what's inside
should be "flushed". Currently we make a hard cut, there is no way
that the user knows what's sent or not BUT that is the case for
xmit_do() anyway, it's not reliable... people need to have the right
upper layer protocol. However, I think we could run into problems
especially if we have features like waiting for the socket error queue
to know if e.g. an ack was received or not.

> And even though we might accept this situation, I don't know how to
> check the emptiness of the netif queue. Any inputs?

Don't think about it, I see a practical issue here which I keep in my mind.

- Alex
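
For illustration only, the ordering discussed above (refuse new data
frames first, then flush what is already queued, then do the MLME
transmission, then release the queue) could be modelled roughly as
below. This is a plain userspace sketch, not kernel code: struct
model_dev and every model_*() function are hypothetical names made up
for this example and do not exist in mac802154, and the "radio" is just
a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_CAP 8

/* Hypothetical userspace model; none of these names exist in mac802154. */
struct model_dev {
	int queue[MODEL_QUEUE_CAP];	/* pending data frames (FIFO ring) */
	size_t head;			/* index of the oldest pending frame */
	size_t count;			/* number of pending frames */
	int hold_count;			/* > 0: new data frames are refused */
	int sent;			/* frames handed to the "radio" so far */
};

/* Data path: refuse new frames while an MLME op holds the queue. */
bool model_enqueue(struct model_dev *dev, int frame)
{
	if (dev->hold_count > 0 || dev->count == MODEL_QUEUE_CAP)
		return false;
	dev->queue[(dev->head + dev->count) % MODEL_QUEUE_CAP] = frame;
	dev->count++;
	return true;
}

/* Pop the oldest pending frame and "transmit" it. */
static void model_xmit_one(struct model_dev *dev)
{
	dev->head = (dev->head + 1) % MODEL_QUEUE_CAP;
	dev->count--;
	dev->sent++;
}

/*
 * Hold the queue first, then drain what was already accepted. Because
 * nothing new can be enqueued once hold_count is raised, the loop is
 * bounded by the queue capacity. Returns the number of flushed frames.
 */
int model_mlme_op_pre(struct model_dev *dev)
{
	int flushed = 0;

	dev->hold_count++;
	while (dev->count > 0) {
		model_xmit_one(dev);
		flushed++;
	}
	return flushed;
}

/* MLME transmission: only valid while the data queue is held. */
int model_mlme_tx(struct model_dev *dev, int frame)
{
	(void)frame;			/* payload is irrelevant to the model */
	assert(dev->hold_count > 0);
	dev->sent++;
	return 0;
}

/* Release the hold so the data path may enqueue again. */
void model_mlme_op_post(struct model_dev *dev)
{
	assert(dev->hold_count > 0);
	dev->hold_count--;
}
```

The point of raising the hold before draining is that the drain can no
longer be starved by a user flooding the queue; whether the real netdev
queue can be handled this way is exactly the open question in this
thread, so treat the above as a picture of the idea, nothing more.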