Hi Amitkumar, On Thu, Oct 23, 2014 at 2:40 PM, Amitkumar Karwar <akarwar@xxxxxxxxxxx> wrote: > Hi Marek, > >>> I tried to capture logs but when enable DYNAMIC_DEBUG I cannot >>> reproduce issue (running test > 30 minutes without allocation >>> failure). > > Thanks for the testing. Yes. Sometimes timing issues won't get reproduced with debug messages enabled. > >>Any update on this? Should I provide some other logs? > > What's the size of Rx data packets? Is the Rx data AMSDU aggregated?(You can check if "if (rx_pkt_type == PKT_TYPE_AMSDU)" check is passed in mwifiex code) If so, disable AMSDU option in AP and try to reproduce the issue. Size of Rx data packets are : pkt type == 2, size - 1574bytes + BAR pkt type, size - 34bytes. AMSDU isn't enabled on AP (verified by adding debug message in mwifiex_process_sta_rx_packet()). > > As you suspected earlier, we might have missed to free skbs allocated for Rx data which leads to SKB allocation failure. There is very less probability for this. But we can try below experiment. > > 1) I observed that debug variable "adapter->rx_pending" doesn;t get decremented when packet is submitted to kernel. Add below code change(valid only for AMSDU disabled case. Because multiple packets are submitted to kernel when AMSDU is enabled) > > ---------- > --- a/drivers/net/wireless/mwifiex/util.c > +++ b/drivers/net/wireless/mwifiex/util.c > @@ -218,6 +218,7 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, struct sk_buff *skb) > > priv->stats.rx_bytes += skb->len; > priv->stats.rx_packets++; > + atomic_dec(&priv->adapter->rx_pending); > if (in_interrupt()) > netif_rx(skb); > ---------- OK, patch applied. > > 2) Add BUG_ON when first time SKB allocation is failed and print "rx_pending". If it's a huge number, we have missed freeing allocated SKB. [ 167.624452] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed, rx_pending:26893 [ 167.632885] ------------[ cut here ]------------ [ 167.637743] Kernel BUG at bf8a22ae [verbose debug info unavailable] so number seems to be huge and we seems miss free allocated skb. I did some hacks to code which shows how many packets are received between 2 BAR packets + I print every 500ms rx_pending packets (when packet is received) and also when packet is send to kernel. I also update counter how many packets after reordering are sent to kernel. Log: [ 71.973800] usb 1-1: rx_pending:11 [ 72.077308] usb 1-1: rx_pending kernel:10 [ 72.477546] usb 1-1: rx_pending:868 [ 72.587877] usb 1-1: rx_pending kernel:1096 [ 72.818041] usb 1-1: Received between 2 BAR:6275 [ 72.823127] usb 1-1: Networking send size:6271 [ 72.987375] usb 1-1: rx_pending:1940 [ 73.097504] usb 1-1: rx_pending kernel:2159 [ 73.431973] usb 1-1: Received between 2 BAR:1602 [ 73.437106] usb 1-1: Networking send size:1608 [ 73.497381] usb 1-1: rx_pending:2983 [ 73.608315] usb 1-1: rx_pending kernel:3091 [ 74.007379] usb 1-1: rx_pending:3767 [ 74.117879] usb 1-1: rx_pending kernel:3998 [ 74.517375] usb 1-1: rx_pending:4854 [ 74.543168] usb 1-1: Received between 2 BAR:3152 [ 74.548371] usb 1-1: Networking send size:3152 [ 74.627509] usb 1-1: rx_pending kernel:5062 [ 74.743362] usb 1-1: Received between 2 BAR:495 [ 74.748483] usb 1-1: Networking send size:494 [ 75.027523] usb 1-1: rx_pending:5872 [ 75.137961] usb 1-1: rx_pending kernel:6106 [ 75.537485] usb 1-1: rx_pending:6959 [ 75.647934] usb 1-1: rx_pending kernel:7188 [ 75.656273] usb 1-1: Received between 2 BAR:2383 [ 75.661528] usb 1-1: Networking send size:2382 [ 76.047441] usb 1-1: rx_pending:8004 [ 76.157712] usb 1-1: rx_pending kernel:8240 [ 76.557547] usb 1-1: rx_pending:9095 [ 76.667991] usb 1-1: rx_pending kernel:9326 [ 76.769662] usb 1-1: Received between 2 BAR:2918 [ 76.775047] usb 1-1: Networking send size:2914 [ 77.067491] usb 1-1: rx_pending:10155 [ 77.177524] usb 1-1: rx_pending kernel:10383 [ 77.577547] usb 1-1: rx_pending:11237 [ 77.687859] usb 1-1: rx_pending kernel:11461 [ 78.087401] usb 1-1: rx_pending:12304 [ 78.197992] usb 1-1: rx_pending kernel:12539 [ 78.597442] usb 1-1: rx_pending:13391 [ 78.707786] usb 1-1: rx_pending kernel:13628 [ 79.107487] usb 1-1: rx_pending:14469 [ 79.217812] usb 1-1: rx_pending kernel:14704 [ 79.617528] usb 1-1: rx_pending:15555 [ 79.727734] usb 1-1: rx_pending kernel:15781 [ 80.127486] usb 1-1: rx_pending:16624 [ 80.237756] usb 1-1: rx_pending kernel:16855 [ 80.637423] usb 1-1: rx_pending:17699 [ 80.747868] usb 1-1: rx_pending kernel:17926 [ 81.147446] usb 1-1: rx_pending:18769 [ 81.257800] usb 1-1: rx_pending kernel:18995 [ 81.657399] usb 1-1: rx_pending:19851 [ 81.767785] usb 1-1: rx_pending kernel:20083 [ 82.167341] usb 1-1: rx_pending:20938 [ 82.278005] usb 1-1: rx_pending kernel:21174 [ 82.677527] usb 1-1: rx_pending:22025 [ 82.787508] usb 1-1: rx_pending kernel:22246 [ 83.187390] usb 1-1: rx_pending:23103 [ 83.297690] usb 1-1: rx_pending kernel:23326 [ 83.697376] usb 1-1: rx_pending:24182 [ 83.807711] usb 1-1: rx_pending kernel:24404 [ 84.207487] usb 1-1: rx_pending:25236 [ 84.317747] usb 1-1: rx_pending kernel:25466 [ 84.717378] usb 1-1: rx_pending:26308 [ 84.827806] usb 1-1: rx_pending kernel:26522 [ 85.070931] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27017 [ 85.079788] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27019 [ 85.088424] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27020 [ 85.097029] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27020 [ 85.105845] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27022 [ 85.114468] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27023 [ 85.337391] usb 1-1: rx_pending kernel:22327 [ 85.400153] usb 1-1: Received between 2 BAR:27875 [ 85.405135] usb 1-1: Networking send size:27881 [ 85.766194] usb 1-1: Received between 2 BAR:7781 [ 85.771152] usb 1-1: Networking send size:7781 [ 85.847374] usb 1-1: rx_pending kernel:11681 [ 86.112809] usb 1-1: Received between 2 BAR:7263 [ 86.117755] usb 1-1: Networking send size:7262 [ 86.357378] usb 1-1: rx_pending kernel:827 According log it seems rx_pending is slowly increasing until allocation fails. Code hacks are attached. > > 3) Also, get rx_pending info when Rx traffic works fine with 40M bandwidth option. when -b40 is set then IMO 11n reordering isn't applied. I added some debug code to mwifiex_11n_rx_reorder_pkt() .. snip .. tbl = mwifiex_11n_get_rx_reorder_tbl(priv, tid, ta); if (!tbl) { .. snip .. is always true (!tbl) so rx_pending after session is 0. > > Btw, could you move to the firmware image (14.68.29.p38) > shared recently? Tested with 14.68.29.p38 FW version. Issue still reproducible. > > By any means(redirecting the log to serial console etc.), could you please capture and share kernel trace logs when system becomes unresponsive. It may give some clue and help us debug further. As I wrote earlier with DYNAMIC_DEBUG I cannot reproduce issue. Maybe when we drop some debug code (received packet over USB) issue can be reproducible again, but it's just guess. > > Best Regards, > Amit BR, marek -- as simple and primitive as possible ------------------------------------------------- Marek Belisko - OPEN-NANDRA Freelance Developer Ruska Nova Ves 219 | Presov, 08005 Slovak Republic Tel: +421 915 052 184 skype: marekwhite twitter: #opennandra web: http://open-nandra.com
diff --git a/drivers/net/wireless/mwifiex/11n_rxreorder.c b/drivers/net/wireless/mwifiex/11n_rxreorder.c index 5e796f8..f6a8d0d 100644 --- a/drivers/net/wireless/mwifiex/11n_rxreorder.c +++ b/drivers/net/wireless/mwifiex/11n_rxreorder.c @@ -26,6 +26,7 @@ #include "11n.h" #include "11n_rxreorder.h" +static int pkt_send = 0; /* * This function dispatches all packets in the Rx reorder table until the * start window. @@ -36,7 +37,7 @@ */ static void mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, - struct mwifiex_rx_reorder_tbl *tbl, int start_win) + struct mwifiex_rx_reorder_tbl *tbl, int start_win, int *send_size) { int pkt_to_send, i; void *rx_tmp_ptr; @@ -45,6 +46,7 @@ mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, pkt_to_send = (start_win > tbl->start_win) ? min((start_win - tbl->start_win), tbl->win_size) : tbl->win_size; +// pr_info("%s - pkt_to_send:%d\n", __func__, pkt_to_send); for (i = 0; i < pkt_to_send; ++i) { spin_lock_irqsave(&priv->rx_pkt_lock, flags); @@ -57,8 +59,11 @@ mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, if (rx_tmp_ptr) { if (priv->bss_role == MWIFIEX_BSS_ROLE_UAP) mwifiex_handle_uap_rx_forward(priv, rx_tmp_ptr); - else + else { mwifiex_process_rx_packet(priv, rx_tmp_ptr); + if (send_size) + *send_size += 1; + } } } @@ -86,7 +91,7 @@ mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, */ static void mwifiex_11n_scan_and_dispatch(struct mwifiex_private *priv, - struct mwifiex_rx_reorder_tbl *tbl) + struct mwifiex_rx_reorder_tbl *tbl, int *send_size) { int i, j, xchg; void *rx_tmp_ptr; @@ -104,8 +109,10 @@ mwifiex_11n_scan_and_dispatch(struct mwifiex_private *priv, if (priv->bss_role == MWIFIEX_BSS_ROLE_UAP) mwifiex_handle_uap_rx_forward(priv, rx_tmp_ptr); - else + else { mwifiex_process_rx_packet(priv, rx_tmp_ptr); + *send_size += 1; + } } spin_lock_irqsave(&priv->rx_pkt_lock, flags); @@ -140,7 +147,7 @@ mwifiex_del_rx_reorder_entry(struct mwifiex_private *priv, return; mwifiex_11n_dispatch_pkt(priv, tbl, (tbl->start_win + tbl->win_size) & - (MAX_TID_VALUE - 1)); + (MAX_TID_VALUE - 1), NULL); del_timer(&tbl->timer_context.timer); @@ -238,7 +245,7 @@ mwifiex_flush_data(unsigned long context) dev_dbg(ctx->priv->adapter->dev, "info: flush data %d\n", start_win); mwifiex_11n_dispatch_pkt(ctx->priv, ctx->ptr, (ctx->ptr->start_win + start_win + 1) & - (MAX_TID_VALUE - 1)); + (MAX_TID_VALUE - 1), NULL); } /* @@ -267,7 +274,7 @@ mwifiex_11n_create_rx_reorder_tbl(struct mwifiex_private *priv, u8 *ta, */ tbl = mwifiex_11n_get_rx_reorder_tbl(priv, tid, ta); if (tbl) { - mwifiex_11n_dispatch_pkt(priv, tbl, seq_num); + mwifiex_11n_dispatch_pkt(priv, tbl, seq_num, NULL); return; } /* if !tbl then create one */ @@ -434,11 +441,15 @@ int mwifiex_11n_rx_reorder_pkt(struct mwifiex_private *priv, tbl = mwifiex_11n_get_rx_reorder_tbl(priv, tid, ta); if (!tbl) { + pr_info("!tbl\n"); if (pkt_type != PKT_TYPE_BAR) { if (priv->bss_role == MWIFIEX_BSS_ROLE_UAP) mwifiex_handle_uap_rx_forward(priv, payload); else mwifiex_process_rx_packet(priv, payload); + } else { + // missing free (maybe also rx_pending) + dev_kfree_skb_any(payload); } return 0; } @@ -485,7 +496,7 @@ int mwifiex_11n_rx_reorder_pkt(struct mwifiex_private *priv, start_win = (end_win - win_size) + 1; else start_win = (MAX_TID_VALUE - (win_size - seq_num)) + 1; - mwifiex_11n_dispatch_pkt(priv, tbl, start_win); + mwifiex_11n_dispatch_pkt(priv, tbl, start_win, &pkt_send); } if (pkt_type != PKT_TYPE_BAR) { @@ -498,13 +509,16 @@ int mwifiex_11n_rx_reorder_pkt(struct mwifiex_private *priv, return -1; tbl->rx_reorder_ptr[pkt_index] = payload; + } else if (pkt_type == PKT_TYPE_BAR) { + dev_info(priv->adapter->dev, "Networking send size:%d\n", pkt_send); + pkt_send = 0; } /* * Dispatch all packets sequentially from start_win until a * hole is found and adjust the start_win appropriately */ - mwifiex_11n_scan_and_dispatch(priv, tbl); + mwifiex_11n_scan_and_dispatch(priv, tbl, &pkt_send); return 0; } diff --git a/drivers/net/wireless/mwifiex/sta_rx.c b/drivers/net/wireless/mwifiex/sta_rx.c index b5c1095..bc1c0d7 100644 --- a/drivers/net/wireless/mwifiex/sta_rx.c +++ b/drivers/net/wireless/mwifiex/sta_rx.c @@ -122,6 +122,7 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, struct rx_packet_hdr *rx_pkt_hdr; u8 ta[ETH_ALEN]; u16 rx_pkt_type, rx_pkt_offset, rx_pkt_length, seq_num; + static int pkts_received = 0; local_rx_pd = (struct rxpd *) (skb->data); rx_pkt_type = le16_to_cpu(local_rx_pd->rx_pkt_type); @@ -146,6 +147,7 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, } if (rx_pkt_type == PKT_TYPE_AMSDU) { + pr_info("AMSDU\n"); struct sk_buff_head list; struct sk_buff *rx_skb; @@ -168,6 +170,7 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, ret = mwifiex_process_mgmt_packet(priv, skb); if (ret) dev_err(adapter->dev, "Rx of mgmt packet failed"); + atomic_dec(&priv->adapter->rx_pending); dev_kfree_skb_any(skb); return ret; } @@ -191,10 +194,19 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, ETH_ALEN); } + if (rx_pkt_type != PKT_TYPE_BAR && skb->len > 1500) { + pkts_received++; + } + + if (rx_pkt_type == PKT_TYPE_BAR) { + dev_info(priv->adapter->dev, "Received between 2 BAR:%d\n", pkts_received); + pkts_received = 0; + } /* Reorder and send to OS */ ret = mwifiex_11n_rx_reorder_pkt(priv, seq_num, local_rx_pd->priority, ta, (u8) rx_pkt_type, skb); + if (ret || (rx_pkt_type == PKT_TYPE_BAR)) { if (adapter->if_ops.data_complete) adapter->if_ops.data_complete(adapter, skb); diff --git a/drivers/net/wireless/mwifiex/usb.c b/drivers/net/wireless/mwifiex/usb.c index 7d9d10b..959daed 100644 --- a/drivers/net/wireless/mwifiex/usb.c +++ b/drivers/net/wireless/mwifiex/usb.c @@ -38,6 +38,7 @@ static struct usb_device_id mwifiex_usb_table[] = { MODULE_DEVICE_TABLE(usb, mwifiex_usb_table); +unsigned long timeout = 0; /* Indicate if load FW in MFG (testing) mode */ static int mfg_mode = 0; @@ -178,8 +179,14 @@ static void mwifiex_usb_rx_complete(struct urb *urb) atomic_inc(&adapter->rx_pending); status = mwifiex_usb_recv(adapter, skb, context->ep); + if (time_after(jiffies, timeout)) { + dev_info(adapter->dev, "rx_pending:%d\n", atomic_read(&adapter->rx_pending)); + timeout = jiffies + msecs_to_jiffies(500); + } + dev_dbg(adapter->dev, "info: recv_length=%d, status=%d\n", recv_length, status); + if (status == -EINPROGRESS) { queue_work(adapter->workqueue, &adapter->main_work); @@ -259,7 +266,7 @@ static int mwifiex_usb_submit_rx_urb(struct urb_context *ctx, int size) ctx->skb = dev_alloc_skb(size); if (!ctx->skb) { dev_err(adapter->dev, - "%s: dev_alloc_skb failed\n", __func__); + "%s: dev_alloc_skb failed - rx_pending:%d\n", __func__, atomic_read(&adapter->rx_pending)); return -ENOMEM; } } @@ -344,6 +351,7 @@ static int mwifiex_usb_probe(struct usb_interface *intf, bcd_usb = le16_to_cpu(udev->descriptor.bcdUSB); pr_debug("info: VID/PID = %X/%X, Boot2 version = %X\n", id_vendor, id_product, bcd_device); + timeout = jiffies + msecs_to_jiffies(1000); /* PID_1 is used for firmware downloading only */ if (id_product == USB8797_PID_1) diff --git a/drivers/net/wireless/mwifiex/util.c b/drivers/net/wireless/mwifiex/util.c index 2155397..b18ee13 100644 --- a/drivers/net/wireless/mwifiex/util.c +++ b/drivers/net/wireless/mwifiex/util.c @@ -25,6 +25,7 @@ #include "wmm.h" #include "11n.h" +unsigned long timeout = 0; /* * Firmware initialization complete callback handler. * @@ -195,6 +196,9 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, struct sk_buff *skb) skb->protocol = eth_type_trans(skb, priv->netdev); skb->ip_summed = CHECKSUM_NONE; + if (timeout == 0) + timeout = jiffies; + /* This is required only in case of 11n and USB as we alloc * a buffer of 4K only if its 11N (to be able to receive 4K * AMSDU packets). In case of SD we allocate buffers based @@ -218,6 +222,12 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, struct sk_buff *skb) priv->stats.rx_bytes += skb->len; priv->stats.rx_packets++; + atomic_dec(&priv->adapter->rx_pending); + if (time_after(jiffies, timeout)) { + dev_info(priv->adapter->dev, "rx_pending kernel:%d\n", atomic_read(&priv->adapter->rx_pending)); + timeout = jiffies + msecs_to_jiffies(500); + } + if (in_interrupt()) netif_rx(skb); else