Search Linux Wireless

Re: Asus eeepc 1008HA suspend issue and mac80211 suspend corner case

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 22, 2009 at 08:54:20PM -0800, Sujith Manoharan wrote:
> Luis Rodriguez wrote:
> > Did you see the last one? It turns out its the only config option
> > passed by mac8021 I've seen so far so my commit log should change.
> 
> Right, that patch should work.
> So the one I sent out and this one fixes the issue ?

Actually the last one alone fixes the issue, likely because harware is not
failing on the routines during the stop() callback when stopping harware, that
is theese guys don't fail if in NETWORK_SLEEP it seems:

        ath9k_hw_disable(ah);
        ath9k_hw_configpcipowersave(ah, 1, 1);
        ath9k_setpower(sc, ATH9K_PM_FULL_SLEEP);

This is likely due to lack of proper documentation on our part of which
hardware blocks exactly cannot be accessed when the devices is in
NETWORK_SLEEP. But to be safe its better to just sprinkly the ps_awake
/resume calls where we do see harware being poked. If we find details
on this question though we can likely enhance power saving more
and do some operations even while in NETWORK_SLEEP.

Then as for the core of the issue -- it was not ath9k anyway, the last
patch I posted indeed fixes the issue but mac80211 drivers should not have
to worry about such issues.

As it turns out mac80211 is scheduling the dynamic_ps_disable_work work
after the driver stop() call has been called. This exposes two issues then
on mac80211. But to elaborate on those let me paste the some relevants
parts of pm.c __ieee80211_suspend():

int __ieee80211_suspend(struct ieee80211_hw *hw)
{
	...
        ieee80211_stop_queues_by_reason(hw,
                        IEEE80211_QUEUE_STOP_REASON_SUSPEND);

        /* flush out all packets */
        synchronize_net();

        local->quiescing = true;
        /* make quiescing visible to timers everywhere */
        mb();

        flush_workqueue(local->workqueue);
	...

        /* stop hardware - this must stop RX */
        if (local->open_count)
                ieee80211_stop_device(local);

        local->suspended = true;
        /* need suspended to be visible before quiescing is false */
        barrier();
        local->quiescing = false;

        return 0;
}

The issues are as follows:

1) We stop TX and flush all packets out, and then call the driver stop().
   Unfortunately there is a failed assumption that even ieee80211_xmit()
   would not be called after stopping TX as __ieee80211_suspend() does
   above.
2) Since ieee80211_xmit() is being called even after the driver stop()
   callback it means mac80211 can potentially schedule work. Now the
   new rework on the mac80211 workqueue pushes us to WARN when either
   a driver or mac80211 tried to queue work onto the mac80211 workqueue
   and we're suspended. A new patch from Johannes futher enhances this
   to take into consideration resume so that we can allow drivers / mac80211
   to queue work if we were suspended but now resuming. Even with these
   checks in place I note that currently we do slip work through after
   the driver stop() callback is called and before loacl->suspended is
   set to true. So there seems to be a race here.

The first issue is not so clear to resolve as although we likely do prevent
the networking core from calling ieee80211_subif_start_xmit() it doesn't
mean ieee80211_xmit() internally will not be called by other parts of
mac80211 and indeed I do believe this is what is happening. I'll try
pin point the exact spot where this happens though, but I'll note though
that checking for local->suspended will *not* work since we already know
some work is being queued *and running!* and it wouldn't have if
local->suspended was true already.

static bool ieee80211_can_queue_work(struct ieee80211_local *local)
{
        if (WARN(local->suspended && !local->resuming,
                 "queueing ieee80211 work while going to suspend\n"))
                return false;

        return true;
}

void ieee80211_queue_work(struct ieee80211_hw *hw, struct work_struct *work)
{
        struct ieee80211_local *local = hw_to_local(hw);

        if (!ieee80211_can_queue_work(local))
                return;

        queue_work(local->workqueue, work);
}
EXPORT_SYMBOL(ieee80211_queue_work);

The second issue can likely be fixed by fixing the first one.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]
  Powered by Linux