Hi Bing,

On Tue, Apr 02, 2013 at 07:40:53PM -0700, Bing Zhao wrote:
> 
> > Using NO_PKT_PRIO_TID and tx_pkts_queued to check for an empty state
> > can lead to a contradictory state, resulting in an infinite loop.
> > Currently, queueing and dequeuing of packets are not synchronized and
> > can happen concurrently. While tx_pkts_queued is incremented when
> > adding a packet, max prio is set to NO_PKT when the WMM list is empty.
> > If a packet is added right after the check for empty, but before
> > setting max prio to NO_PKT, that packet is trapped and creates an
> > infinite loop.
> > Because of the new packet, tx_pkts_queued is at least 1, indicating
> > the wmm lists are not empty. Opposing that, max prio is NO_PKT, which
> > means "skip this wmm queue, it has no packets". The infinite loop
> > results because the main loop checks the wmm lists for not empty via
> > tx_pkts_queued, but when dequeuing uses max_prio to see if it can
> > skip a list. This will never end, unless a new packet is added which
> > will restore max prio to the level of the trapped packet.
> > The solution here is to rely on tx_pkts_queued solely for checking
> > the wmm queue to be empty, and to drop the NO_PKT define. It does not
> > address the locking issue.
> >
> > Signed-off-by: Andreas Fenkart <andreas.fenkart@xxxxxxxxxxxxxxxxxxx>
> 
> With this patch (1/6) applied, I'm getting soft-lockup watchdog:
> 
> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:37]

My bad here; it should look like this when the patch is applied first:

@@ -919,8 +919,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
 	do {
 		priv_tmp = bssprio_node->priv;
-		hqp = &priv_tmp->wmm.highest_queued_prio;

+		if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
+			goto skip_bss;
+
+		/* iterate over the WMM queues of the BSS */
+		hqp = &priv_tmp->wmm.highest_queued_prio;
 		for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {

 			tid_ptr = &(priv_tmp)->wmm.
@@ -980,12 +984,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
 			} while (ptr != head);
 		}

-		/* No packet at any TID for this priv. Mark as such
-		 * to skip checking TIDs for this priv (until pkt is
-		 * added).
-		 */
-		atomic_set(hqp, NO_PKT_PRIO_TID);
-
+skip_bss:
 		/* Get next bss priority node */
 		bssprio_node = list_first_entry(&bssprio_node->list,
 						struct mwifiex_bss_prio_node,

That said, yes, I developed the patchset the other way round: first I
cleaned up, until I knew how best to fix the bug. Then I pulled the fix
in front of the cleanup patches and -- mea culpa -- didn't test the
patches individually. Sorry again.

I also found an issue here, which could be a problem without patch 6/6:

--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -688,13 +688,13 @@ mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
 	ra_list->total_pkts_size += skb->len;
 	ra_list->pkt_count++;

-	atomic_inc(&priv->wmm.tx_pkts_queued);
-
 	if (atomic_read(&priv->wmm.highest_queued_prio) <
 						tos_to_tid_inv[tid_down])
 		atomic_set(&priv->wmm.highest_queued_prio,
 						tos_to_tid_inv[tid_down]);

+	atomic_inc(&priv->wmm.tx_pkts_queued);
+

How should I proceed? Can I reorder the patches to match my development
cycle, which is 2-5; 1; 6 -- or, more verbosely: cleanup first,
followed by the bug fix, with proper locking last? Or should I keep the
order as is, but fix patch 1 and propagate the changes through patches
2 till 6?
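For reference, a minimal sketch of the bad interleaving (names like
priv, tid, hqp are simplified from wmm.c; this is illustrative
pseudo-code of the two paths above, not the exact driver code):

	/* Enqueue path, simplified (mwifiex_wmm_add_buf_txqueue): */
	atomic_inc(&priv->wmm.tx_pkts_queued);                   /* E1 */
	if (atomic_read(&priv->wmm.highest_queued_prio) < tid)
		atomic_set(&priv->wmm.highest_queued_prio, tid); /* E2 */

	/* Dequeue path, simplified
	 * (mwifiex_wmm_get_highest_priolist_ptr): */
	for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i)
		;                 /* D1: scans all TIDs, finds nothing */
	atomic_set(hqp, NO_PKT_PRIO_TID);                        /* D2 */

	/*
	 * Interleaving D1, E1, E2, D2: D2 clobbers the priority that
	 * E2 just raised. tx_pkts_queued is now >= 1, so the main loop
	 * keeps polling this BSS, but highest_queued_prio says
	 * "nothing here", so every dequeue attempt skips it -- an
	 * endless loop until another packet restores the priority.
	 */

Relying on tx_pkts_queued alone, and dropping NO_PKT_PRIO_TID, removes
the second, contradictory source of truth, which is what patch 1 does.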
rgds,
Andi