Re: AM3517 DaVinci EMAC Ethernet performance issues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Mugunthan and all,

We have essentially proven our theory, but have yet to solve the problem.  Our theory was that the driver was essentially taking a few second nap during transmission, despite there being plenty more work to do.  We further theorized that our custom userspace application was somehow preventing the driver from taking this nap, and that's why our RX performance is perfect all the time while our TX is so spotty unless that application is running.

Over the last few days, we've went through several test patches and have essentially tracked the issue back to our emac_poll() function not being called.  By inserting the following patch snippet, and snippets like it, into various functions in the emac driver, we were able to prove that from time to time these functions sleep for several seconds, despite iperf certainly having given them something to do.  (Note: the patch has changed a bit since the output pasted below was captured, as we were looking for more detail, but the core functionality is the same.)

===================================================================================================
[[ ---- patch snippet ---- ]]
===================================================================================================

--- linux/drivers/net/ethernet/ti/davinci_emac.c    2013-04-08 16:20:19.202550042 -0400
+++ linux_emac_tst/drivers/net/ethernet/ti/davinci_emac.c    2013-04-08 18:45:37.000000000 -0400
@@ -64,6 +64,12 @@
 
 #include "davinci_cpdma.h"
 
+//// DEBUGGING ONLY!!!!
+#include <linux/jiffies.h>
+static unsigned long qstopped_jifs = 0;
+static unsigned long poll_func_jifs = 0;
+static u32 last_num_rx_pkts = 0, last_num_tx_pkts = 0;
+
 static int debug_level;
 module_param(debug_level, int, 0);
 MODULE_PARM_DESC(debug_level, "DaVinci EMAC debug level (NETIF_MSG bits)");
@@ -1055,7 +1061,19 @@
      * queue is stopped then start the queue as we have free desc for tx
      */
     if (unlikely(netif_queue_stopped(ndev)))
+    {
         netif_wake_queue(ndev);
+        if (qstopped_jifs > 0)
+        {
+            // we've been stopped, measure how long
+            if ( time_after_eq(jiffies, (qstopped_jifs + HZ/2)) )
+            {
+                // we've been stopped >= 1/2 second!!!
+                printk(KERN_ERR "===EMAC: TX queue has been stopped for > 1/2 second!!!\n");
+            }
+        }
+        qstopped_jifs = 0;
+    }    
     ndev->stats.tx_packets++;
     ndev->stats.tx_bytes += len;
     dev_kfree_skb_any(skb);
@@ -1105,7 +1123,11 @@
      * tell the kernel to stop sending us tx frames.
      */
     if (unlikely(!cpdma_check_free_tx_desc(priv->txchan)))
+    {
         netif_stop_queue(ndev);
+        // we just stopped, record the time
+        qstopped_jifs = jiffies;
+    }
 
     return NETDEV_TX_OK;
 
@@ -1374,6 +1396,15 @@
     u32 status = 0;
     u32 num_tx_pkts = 0, num_rx_pkts = 0;
 
+
+unsigned long tmp = jiffies;
+if ( time_after_eq(tmp, (poll_func_jifs + HZ/2)) )
+{
+    // we've been stopped >= 1/2 second!!!
+    printk(KERN_ERR "===EMAC: POLL FUNCTION, last RX/TX=[%d][%d]\n", last_num_rx_pkts, last_num_tx_pkts);
+}
+poll_func_jifs = tmp;
+
     /* Check interrupt vectors and call packet processing */
     status = emac_read(EMAC_MACINVECTOR);
 
@@ -1385,6 +1416,10 @@
     if (status & mask) {
         num_tx_pkts = cpdma_chan_process(priv->txchan,
                           EMAC_DEF_TX_MAX_SERVICE);
+        if (num_tx_pkts < 0)
+        {
+printk(KERN_ERR "=== ERR!!! num_tx_pkts=%d\n", num_tx_pkts);        
+        }
     } /* TX processing */
 
     mask = EMAC_DM644X_MAC_IN_VECTOR_RX_INT_VEC;
@@ -1394,8 +1429,17 @@
 
     if (status & mask) {
         num_rx_pkts = cpdma_chan_process(priv->rxchan, budget);
+        if (num_rx_pkts < 0)
+        {
+printk(KERN_ERR "=== ERR!!! num_rx_pkts=%d\n", num_rx_pkts);
+        }
     } /* RX processing */
 
+
+last_num_tx_pkts = num_tx_pkts;
+last_num_rx_pkts = num_rx_pkts;
+
+
     mask = EMAC_DM644X_MAC_IN_VECTOR_HOST_INT;
     if (priv->version == EMAC_VERSION_2)
         mask = EMAC_DM646X_MAC_IN_VECTOR_HOST_INT;

===================================================================================================
[[ ---- patch snippet ---- ]]
===================================================================================================


It appears from our debugging that somehow the emac_poll() function is not being called despite there being work to do.  As before however, when we start our custom application (which presently does very little from a system calls' perspective), our performance stabilizes greatly.  Please see below.

Has anyone else seen anything like this with the EMAC, or have any idea why or how this could be happening?  We initially thought it could just be a failure to properly reschedule the emac_wait(), but after playing with various items there for a while, we're convinced that's not it.  Now we're guessing that perhaps softirqs or something within the DMA world is to blame here, but we're certainly not experts in this area.  Again, we've passed in 'nohlt' and have worked to remove power management.  So we don't think that plays a role.  Likewise, we never see a packet error -- even in our Wireshark traces.  We just see the system take a split-second to multi-second break.

What specifically causes the emac_poll() and emac_tx_handler() functions to fire, and is there a way to debug that cleanly?

Can anyone imagine how a userspace application would be able to improve this behavior so much?

Again, thank you so very much for your time.
CF


Boot log:
===================================================================================================
Fresh boot with no network activity, we see this with our debugging in there.
===================================================================================================

[   22.815582] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[   22.822937] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[   23.817993] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[   23.825378] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[   24.817810] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[   24.825225] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[   30.793365] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[   30.800811] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[   31.427093] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[   31.434661] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[   32.426879] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[   32.434387] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!

root@test:~# iperf-i 5 -t 30000 -c 10.22.255.254
[   43.093078] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[   43.100463] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!

===================================================================================================
Starting iperf with 4x instances of <Custom App> stuff running.  See?  It's very happy for a long time.
===================================================================================================

------------------------------------------------------------
Client connecting to 10.22.255.254, TCP port 5001
TCP window size: 19.6 KByte (default)
------------------------------------------------------------
[  3] local 10.22.0.2 port 44102 connected with 10.22.255.254 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 5.0 sec  56.8 MBytes  95.3 Mbits/sec
[  3]  5.0-10.0 sec  56.0 MBytes  94.0 Mbits/sec
[[ ---- SNIP ---- ]]
[  3] 615.0-620.0 sec  56.4 MBytes  94.6 Mbits/sec
[  3] 620.0-625.0 sec  55.3 MBytes  92.9 Mbits/sec

===================================================================================================
Stopping all 4 <Custom App> instances from running...
===================================================================================================
         Stopping <Custom App>1...
[  OK  ] Stopped <Custom App>1.
         Stopping <Custom App>2...
[  OK  ] Stopped <Custom App>2.
         Stopping <Custom App>3...
[  OK  ] Stopped <Custom App>3.
[  3] 625.0-630.0 sec  55.1 MBytes  92.4 Mbits/sec
         Stopping <Custom App>4...
[  OK  ] Stopped <Custom App>4.

===================================================================================================
Instantaneously, we see the POLL (and therefore the TX handler) warnings getting called
and the performance is slipping and/or timing out all together.
===================================================================================================

[  681.469207] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[  681.476501] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[  3] 630.0-635.0 sec  8.59 MBytes  14.4 Mbits/sec
[  692.489929] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[  692.497222] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[  692.504272] ===EMAC: TX queue has been stopped for > 1/2 second!!!
[  3] 635.0-640.0 sec  11.4 MBytes  19.1 Mbits/sec
[  3] 640.0-645.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 645.0-650.0 sec  5.71 MBytes  9.58 Mbits/sec
[  3] 650.0-655.0 sec  29.6 MBytes  49.6 Mbits/sec
[  3] 655.0-660.0 sec  46.7 MBytes  78.3 Mbits/sec
[  714.523315] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[  714.530609] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[  3] 660.0-665.0 sec  42.3 MBytes  71.0 Mbits/sec
[  3] 665.0-670.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 670.0-675.0 sec  37.5 MBytes  62.9 Mbits/sec
[  3] 675.0-680.0 sec  53.1 MBytes  89.1 Mbits/sec
[  3] 680.0-685.0 sec  44.8 MBytes  75.2 Mbits/sec
[  3] 685.0-690.0 sec  49.2 MBytes  82.6 Mbits/sec
[  3] 690.0-695.0 sec  49.0 MBytes  82.2 Mbits/sec
[  747.616729] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[  747.624023] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[  3] 695.0-700.0 sec  40.2 MBytes  67.4 Mbits/sec
[  3] 700.0-705.0 sec  5.38 MBytes  9.03 Mbits/sec
[  3] 705.0-710.0 sec  51.8 MBytes  86.9 Mbits/sec
         Starting <Custom App>2...
[  OK  ] Started <Custom App>2.
[  3] 710.0-715.0 sec  51.9 MBytes  87.1 Mbits/sec
[  3] 715.0-720.0 sec  54.1 MBytes  90.8 Mbits/sec
[  3] 720.0-725.0 sec  53.6 MBytes  89.9 Mbits/sec
[  3] 725.0-730.0 sec  53.3 MBytes  89.4 Mbits/sec
[  3] 730.0-735.0 sec  53.2 MBytes  89.3 Mbits/sec
[  3] 735.0-740.0 sec  56.1 MBytes  94.1 Mbits/sec
[  3] 740.0-745.0 sec  52.0 MBytes  87.3 Mbits/sec
[  3] 745.0-750.0 sec  52.5 MBytes  88.1 Mbits/sec
[  3] 750.0-755.0 sec  55.9 MBytes  93.8 Mbits/sec
[  3] 755.0-760.0 sec  55.9 MBytes  93.9 Mbits/sec
[  3] 760.0-765.0 sec  53.3 MBytes  89.5 Mbits/sec
^C[  3]  0.0-765.9 sec  7.93 GBytes  88.9 Mbits/sec
===================================================================================================
Shut down iperf and drop back to the default idling
===================================================================================================
root@test:~# [  809.822906] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[  809.960937] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[  810.822845] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[  810.960937] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
[  813.143280] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!!
[  813.355712] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!!
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux