Re: [PATCH wpan-next v2 0/2] ieee802154: Beaconing support

Alexander Aring <aahringo@xxxxxxxxxx> · Sun, 5 Feb 2023 20:41:00 -0500

Hi,

On Fri, Feb 3, 2023 at 10:19 AM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
>
> Hi Alexander,
>
> aahringo@xxxxxxxxxx wrote on Wed, 1 Feb 2023 12:15:42 -0500:
>
> > Hi,
> >
> > On Tue, Jan 31, 2023 at 6:25 AM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> > >
> > > Hi Alexander,
> > >
> > > > > > > > Changes in v2:
> > > > > > > > * Clearly state in the commit log llsec is not supported yet.
> > > > > > > > * Do not use mlme transmission helpers because we don't really need to
> > > > > > > >   stop the queue when sending a beacon, as we don't expect any feedback
> > > > > > > >   from the PHY nor from the peers. However, we don't want to go through
> > > > > > > >   the whole net stack either, so we bypass it calling the subif helper
> > > > > > > >   directly.
> > > > > > > >
> > > > > >
> > > > > > moment, we use the mlme helpers to stop tx
> > > > >
> > > > > No, we no longer use the mlme helpers to stop tx when sending beacons
> > > > > (but true MLME transmissions, we ack handling and return codes will be
> > > > > used for other purposes).
> > > > >
> > > >
> > > > then we run into an issue overwriting the framebuffer while the normal
> > > > transmit path is active?
> > >
> > > Crap, yes you're right. That's not gonna work.
> > >
> > > The net core acquires HARD_TX_LOCK() to avoid these issues and we are
> > > no bypassing the net core without taking care of the proper frame
> > > transmissions either (which would have worked with mlme_tx_one()). So I
> > > guess there are two options:
> > >
> > > * Either we deal with the extra penalty of stopping the queue and
> > >   waiting for the beacon to be transmitted with an mlme_tx_one() call,
> > >   as proposed initially.
> > >
> > > * Or we hardcode our own "net" transmit helper, something like:
> > >
> > > mac802154_fast_mlme_tx() {
> > >         struct net_device *dev = skb->dev;
> > >         struct netdev_queue *txq;
> > >
> > >         txq = netdev_core_pick_tx(dev, skb, NULL);
> > >         cpu = smp_processor_id();
> > >         HARD_TX_LOCK(dev, txq, cpu);
> > >         if (!netif_xmit_frozen_or_drv_stopped(txq))
> > >                 netdev_start_xmit(skb, dev, txq, 0);
> > >         HARD_TX_UNLOCK(dev, txq);
> > > }
> > >
> > > Note1: this is very close to generic_xdp_tx() which tries to achieve the
> > > same goal: sending packets, bypassing qdisc et al. I don't know whether
> > > it makes sense to define it under mac802154/tx.c or core/dev.c and give
> > > it another name, like generic_tx() or whatever would be more
> > > appropriate. Or even adapting generic_xdp_tx() to make it look more
> > > generic and use that function instead (without the xdp struct pointer).
> > >
> >
> > The problem here is that the transmit handling is completely
> > asynchronous. Calling netdev_start_xmit() is not "transmit and wait
> > until transmit is done", it is "start transmit here is the buffer" an
> > interrupt is coming up to report transmit is done. Until the time the
> > interrupt isn't arrived the framebuffer on the device is in use, we
> > don't know when the transceiver is done reading it. Only after tx done
> > isr. The time until the isr isn't arrived is for us a -EBUSY case due
> > hardware resource limitation. Currently we do that with stop/wake
> > queue to avoid calling of xmit_do() to not run into such -EBUSY
> > cases...
> >
> > There might be clever things to do here to avoid this issue... I am
> > not sure how XDP does that.
> >
> > > Note2: I am wondering if it makes sense to disable bh here as well?
> >
> > May HARD_TX_LOCK() already do that? If they use spin_lock_bh() they
> > disable local softirqs until the lock isn't held anymore.
>
> I saw a case where both are called so I guess the short answer is "no":
> https://elixir.bootlin.com/linux/latest/source/net/core/dev.c#L4307
>
> >
> > >
> > > Once we settle, I send a patch.
> > >
> >
> > Not sure how to preceded here, but do see the problem? Or maybe I
> > overlooked something here...
>
> No you clearly had a sharp eye on that one, I totally see the problem.
>
> Maybe the safest and simplest approach would be to be back using
> the proper mlme transmission helpers for beacons (like in the initial
> proposal).

ok.

> TBH I don't think there is a huge performance hit because in
> both cases we wait for that ISR saying "the packet has been consumed by
> the transceiver". It's just that in one case we wait for the return
> code (MLME) and then return, in the other case we return but no
> more packets will go through until the queue is released by the ISR (as
> you said, in order to avoid the -EBUSY case). So in practice I don't
> expect any performance hit. It is true however that we might want to
> optimize this a little bit if we ever add something like an async
> callback saying "skb consumed by the transceiver, another can be
> queued" and gain a few us. Maybe a comment could be useful here (I'll
> add it to my fix if we agree).

the future will show how things work out here. I am fine with the
initial proposal.

- Alex