Search Linux Wireless

Re: [PATCH v2] mac80211: Fix a race on enabling power save.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 4, 2011 at 6:42 PM, Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote:
> On Fri, 2011-02-04 at 14:08 +0100, Johannes Berg wrote:
>> On Fri, 2011-02-04 at 17:55 +0530, Vivek Natarajan wrote:
>> > There is a race on sending a data frame before the tx completion
>> > of nullfunc frame for enabling power save. As the data quickly
>> > follows the nullfunc frame, the AP thinks that the station is out
>> > of power save and continues to send the frames. Whereas in the
>> > station, the nullfunc ack will be processed after the tx completion
>> > of data frame and mac80211 goes to powersave. Thus the power
>> > save state mismatch between the station and the AP causes some
>> > data loss and some applications fail because of that. This patch
>> > fixes this issue.
>> >
>> > Signed-off-by: Vivek Natarajan <vnatarajan@xxxxxxxxxxx>
>> > ---
>> >  net/mac80211/ieee80211_i.h |    1 +
>> >  net/mac80211/mlme.c        |    8 ++++++--
>> >  net/mac80211/tx.c          |    8 ++++++++
>> >  3 files changed, 15 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
>> > index 533fd32..6ad97f6 100644
>> > --- a/net/mac80211/ieee80211_i.h
>> > +++ b/net/mac80211/ieee80211_i.h
>> > @@ -346,6 +346,7 @@ enum ieee80211_sta_flags {
>> >     IEEE80211_STA_UAPSD_ENABLED     = BIT(7),
>> >     IEEE80211_STA_NULLFUNC_ACKED    = BIT(8),
>> >     IEEE80211_STA_RESET_SIGNAL_AVE  = BIT(9),
>> > +   IEEE80211_STA_PS_PENDING        = BIT(10),
>> >  };
>> >
>> >  struct ieee80211_if_managed {
>> > diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
>> > index e059b3a..45f736e 100644
>> > --- a/net/mac80211/mlme.c
>> > +++ b/net/mac80211/mlme.c
>> > @@ -727,13 +727,17 @@ void ieee80211_dynamic_ps_enable_work(struct work_struct *work)
>> >             return;
>> >
>> >     if ((local->hw.flags & IEEE80211_HW_PS_NULLFUNC_STACK) &&
>> > -       (!(ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED)))
>> > +       (!(ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED))) {
>> > +           ifmgd->flags |= IEEE80211_STA_PS_PENDING;
>> >             ieee80211_send_nullfunc(local, sdata, 1);
>> > +   }
>> >
>> >     if (!((local->hw.flags & IEEE80211_HW_REPORTS_TX_ACK_STATUS) &&
>> >           (local->hw.flags & IEEE80211_HW_PS_NULLFUNC_STACK)) ||
>> > -       (ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED)) {
>> > +       ((ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED) &&
>> > +         ifmgd->flags & IEEE80211_STA_PS_PENDING))  {
>> >             ifmgd->flags &= ~IEEE80211_STA_NULLFUNC_ACKED;
>> > +           ifmgd->flags &= ~IEEE80211_STA_PS_PENDING;
>> >             local->hw.conf.flags |= IEEE80211_CONF_PS;
>> >             ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_PS);
>> >     }
>> > diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
>> > index 8fbbc7a..e1c2256 100644
>> > --- a/net/mac80211/tx.c
>> > +++ b/net/mac80211/tx.c
>> > @@ -185,6 +185,7 @@ ieee80211_tx_h_dynamic_ps(struct ieee80211_tx_data *tx)
>> >  {
>> >     struct ieee80211_local *local = tx->local;
>> >     struct ieee80211_if_managed *ifmgd;
>> > +   struct ieee80211_hdr *hdr;
>> >
>> >     /* driver doesn't support power save */
>> >     if (!(local->hw.flags & IEEE80211_HW_SUPPORTS_PS))
>> > @@ -233,6 +234,13 @@ ieee80211_tx_h_dynamic_ps(struct ieee80211_tx_data *tx)
>> >         && skb_get_queue_mapping(tx->skb) == 0)
>> >             return TX_CONTINUE;
>> >
>> > +   hdr = (struct ieee80211_hdr *)tx->skb->data;
>> > +
>> > +   if (!(ieee80211_is_nullfunc(hdr->frame_control) &&
>> > +        ieee80211_has_pm(hdr->frame_control)) &&
>> > +       (ifmgd->flags & IEEE80211_STA_PS_PENDING))
>> > +           ifmgd->flags &= ~IEEE80211_STA_PS_PENDING;
>> > +
>> >     if (local->hw.conf.flags & IEEE80211_CONF_PS) {
>>
>
> Maybe the subif queues should be stopped, then flush, then tx nullfunc,
> then stop all queues to configure the HW or something like that?

I tried this sequence:
the subif queues stopped, then flush, then tx nullfunc, and wake subif
queues,(we cannot have the queues stopped till we receive tx_status
because nullfunc might have failed during tx path itself and mac80211
will not receive tx_status)
After some time interval, once again stop queues on receiving ack for
nullfunc, configure the hw and then wake up queues. So, during the
above time interval, there is a race of queuing a frame to the hw. I
have tested this and the issue is quickly reproducible.

> Indeed, but the trace still exists between checking PS_PENDING and
> setting CONF_PS.

But this race(in microseconds?) did not happen in my testing for 10 hrs.

For tx path and the mlme PS path to be atomic, a new spinlock is
needed. So, to fix this, I can think of only a lock to protect the
checking of PS_PENDING and setting CONF_PS in mlme.c and lock in tx
path where we check the PS_PENDING.

Thanks
Vivek.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]
  Powered by Linux