Re: Strange spi_sync behaviour

So it seems as if this solves the issue:

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 6ca59406b0b7..9862fd8fb00b 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1416,7 +1416,12 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
	spin_lock_irqsave(&ctlr->queue_lock, flags);
	ctlr->cur_msg = NULL;
	ctlr->cur_msg_prepared = false;
-	kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);
+	/* Only wake the queue kthread if there is anything left in the
+	 * queue; this avoids waking it when spi_sync runs the message
+	 * pump directly in the calling context without the kthread.
+	 */
+	if (!list_empty(&ctlr->queue))
+		kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);
	spin_unlock_irqrestore(&ctlr->queue_lock, flags);

	trace_spi_message_done(mesg);


At least my test case shows no issues with this patch, and the wakeup
of the spi0 kthread (and the corresponding context switches) no longer
happens.

Still, I wonder whether this patch creates a potential race condition.

But entries may only be added to ctlr->queue while ctlr->queue_lock is
held, and that lock is held here, so there should be nothing wrong.
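
For reference, the enqueue side takes the same lock before touching the
queue - roughly this (a simplified excerpt of __spi_queued_transfer,
quoted from memory, so details may differ):

	spin_lock_irqsave(&ctlr->queue_lock, flags);
	/* ... */
	list_add_tail(&msg->queue, &ctlr->queue);
	if (!ctlr->busy && need_pump)
		kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);
	spin_unlock_irqrestore(&ctlr->queue_lock, flags);

So a message can only show up on ctlr->queue while queue_lock is held,
which is exactly the lock spi_finalize_current_message holds around the
new list_empty() check.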

If this looks fine then I can prepare a patch.

Thanks,
	Martin

> On 29.11.2018, at 17:39, kernel@xxxxxxxxxxxxxxxx wrote:
> 
> Hi!
> 
> I have come across a strange behavior on 4.20.0-rc3 with regard
> to context switching, which I want to present here:
> 
> The basic code is:
> void load_test(struct spi_device *spi)
> {
> 	u8 data[] = { 0x20, 0x00 };
> 	int i;
> 	struct spi_transfer xfer;
> 
> 	/* single 2-byte write at 12.5 MHz */
> 	memset(&xfer, 0, sizeof(xfer));
> 	xfer.tx_buf = data;
> 	xfer.len = 2;
> 	xfer.speed_hz = 12500000;
> 
> 	dev_err(&spi->dev, "START-SPI-torture");
> 
> 	/* issue one million synchronous transfers back to back */
> 	for (i = 0; i < 1000000; i++)
> 		spi_sync_transfer(spi, &xfer, 1);
> 
> 	dev_err(&spi->dev, "END-SPI-torture");
> }
> 
> The device on which I am testing is a bcm2837 (Raspberry Pi 3)
> (and, for ease of coding, load_test() is called during network open).
> 
> The spi-bcm2835 driver in this case is using polling as the 
> expected transfer time is < 30us.
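> 
> (As a rough sanity check - my own back-of-the-envelope numbers, not
> taken from the driver: a 2-byte transfer at 12.5 MHz occupies the bus
> for roughly 2 * 8 bits / 12.5 MHz = 1.28 us, well below that limit.)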
> 
> Also the SPI-bus is ONLY used by this specific device.
> 
> So what I would expect is a single process/thread running at 100%
> (as the inline mode of spi_async is used - that is,
> __spi_queued_transfer - see:
> https://elixir.bootlin.com/linux/v4.20-rc3/source/drivers/spi/spi.c#L3184).
> This should show up as CPU utilization of the calling context - in my
> case "ip ...", as I have called it from network open in the CAN
> network driver - and should be visible as such in top.
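> 
> For reference, the fast path in __spi_sync looks roughly like this
> (heavily simplified and quoted from memory, so details may differ):
> 
> 	/* enqueue without waking the pump kthread */
> 	status = __spi_queued_transfer(spi, message, false);
> 
> 	if (status == 0) {
> 		/* run the message pump in the calling context */
> 		SPI_STATISTICS_INCREMENT_FIELD(&ctlr->statistics,
> 					       spi_sync_immediate);
> 		__spi_pump_messages(ctlr, false);
> 	}
> 
> 	/* then wait for the completion set up for this message */
> 	wait_for_completion(&done);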
> 
> This is what _is_ happening - so far so good.
> 
> Statistic counters also show expected behavior:
> root@raspcm3:~# head /sys/class/spi_master/spi0/statistics/spi_*sync* ; echo
> ==> /sys/class/spi_master/spi0/statistics/spi_async <==
> 0
> ==> /sys/class/spi_master/spi0/statistics/spi_sync <==
> 30450931
> ==> /sys/class/spi_master/spi0/statistics/spi_sync_immediate <==
> 30450931
> 
> So everything is running immediately and nothing is delegated to the
> spi_pump thread (spi0).
> 
> But I also see 20-25% utilization on the spi0 kernel thread with top:
> top - 16:33:23 up  3:08,  3 users,  load average: 0.75, 0.18, 0.06
> Tasks:  78 total,   3 running,  40 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  0.2 us, 30.4 sy,  0.0 ni, 69.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> KiB Mem :   938496 total,   644680 free,    75980 used,   217836 buff/cache
> KiB Swap:        0 total,        0 free,        0 used.   793000 avail Mem
> 
>  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 2438 root      20   0    2260    432    356 R 100.0  0.0   0:22.77 ip
>  193 root      20   0       0      0      0 R  31.0  0.0   3:39.03 spi0
> 1991 root      20   0    8212   3276   2736 R   1.0  0.3   0:19.97 top
> 
> Looking at vmstat 1 on the console and dumping kernel messages to the
> console (so that I get both together) I see the following output:
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
> 0  0      0 644612  53564 163872    0    0     0     0 8081   65  0  0 100  0 0
> 0  0      0 644612  53572 163864    0    0     0    32 8105  102  0  0 100  0 0
> 0  0      0 644612  53572 163872    0    0     0     0 8043   34  0  0 100  0 0
> 0  0      0 644344  53572 163868    0    0     0     0 8249  272  1  2 97  0  0
> 0  0      0 644596  53572 163868    0    0     0     0 8133  111  0  1 99  0  0
> 0  0      0 644596  53572 163868    0    0     0     0 8047   28  0  0 100  0 0
> 0  0      0 644628  53572 163868    0    0     0    12 8115  112  1  0 99  0  0
> 0  0      0 644644  53576 163864    0    0     0     4 8088   91  0  0 100  0 0
> 0  0      0 644644  53576 163864    0    0     0     0 8064   64  0  0 100  0 0
> 0  0      0 644692  53576 163864    0    0     0     0 8124  114  0  1 99  0  0
> 0  0      0 644740  53576 163868    0    0     0     0 8045   36  0  0 100  0 0
> 0  0      0 644740  53576 163868    0    0     0     0 8052   45  0  0 100  0 0
> 0  0      0 644740  53576 163868    0    0     0     0 8089   72  1  0 99  0  0
> 0  0      0 644740  53576 163868    0    0     0     0 8045   32  0  0 100  0 0
> 0  0      0 644740  53576 163868    0    0     0     0 8051   36  0  0 100  0 0
> 0  0      0 644740  53576 163868    0    0     0     0 8063   46  0  1 99  0  0
> 0  0      0 644740  53576 163868    0    0     0     0 8043   29  0  0 100  0 0
> [ 9093.932618] mcp25xxfd spi0.0: MCP2517 successfully initialized.
> 0  0      0 644152  53576 163868    0    0     0     0 8235 14808  3 16 81  0 0
> [ 9094.981001] mcp25xxfd spi0.0 can0: OPEN
> [ 9094.998022] mcp25xxfd spi0.0: START-SPI-torture
> 1  0      0 644600  53576 163884    0    0     0     0 7883  107  0  1 98  0  0
> 2  0      0 644696  53576 163884    0    0     0     0 47578 78725  0 34 66  0 0
> 2  0      0 644696  53576 163884    0    0     0     0 46590 76769  0 30 70  0 0
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
> 2  0      0 644696  53576 163884    0    0     0     0 46629 76931  1 29 71  0 0
> 3  0      0 644664  53576 163884    0    0     0     0 46558 76736  0 34 66  0 0
> 2  0      0 644664  53584 163876    0    0     0    20 47575 78817  0 32 68  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 46708 77027  1 34 66  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 47582 78782  0 31 69  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 47548 78791  0 30 70  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 46003 75560  0 33 67  0 0
> 2  0      0 644696  53584 163884    0    0     0     0 47562 78782  0 34 66  0 0
> 2  0      0 644696  53584 163884    0    0     0     0 47584 78775  0 35 65  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 46767 77154  1 36 64  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 47568 78770  0 30 70  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 32955 70157  0 31 69  0 0
> 1  0      0 644664  53584 163884    0    0     0     0 36798 73959  0 30 70  0 0
> 1  0      0 644664  53584 163884    0    0     0     0 48030 79721  0 31 69  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 48008 79664  0 31 69  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 47294 78135  0 32 68  0 0
> 1  0      0 644664  53584 163884    0    0     0     0 47225 78099  0 32 68  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 47206 78095  0 30 70  0 0
> 1  0      0 644664  53584 163884    0    0     0     0 47237 78130  0 31 69  0 0
> 2  0      0 644664  53584 163884    0    0     0     0 47223 78116  0 31 69  0 0
> 1  0      0 644664  53584 163884    0    0     0     0 47224 78121  0 31 69  0 0
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
> 1  0      0 644664  53584 163884    0    0     0     0 47267 78153  1 32 68  0 0
> 1  0      0 644664  53584 163884    0    0     0     0 47268 78122  0 31 69  0 0
> [ 9120.716164] mcp25xxfd spi0.0: END-SPI-torture
> 1  0      0 644664  53584 163884    0    0     0     0 39618 62893  0 30 70  0 0
> [ 9121.241353] IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
> 0  0      0 644648  53584 163884    0    0     0     0 11477 6731  1  7 93  0 0
> 0  0      0 644372  53584 163904    0    0     0     0 8143   97  0  1 99  0  0
> 0  0      0 644420  53584 163904    0    0     0     0 8043   31  0  0 100  0 0
> 0  0      0 644420  53584 163904    0    0     0    12 8093  100  0  0 99  0  0
> 0  0      0 644388  53584 163904    0    0     0     0 8044   31  0  0 100  0 0
> 0  0      0 644404  53592 163896    0    0     0    20 8064   67  0  0 100  0 0
> 0  0      0 644404  53592 163904    0    0     0     0 8094   79  1  1 99  0  0
> 0  0      0 644436  53592 163904    0    0     0     0 8040   28  0  0 100  0 0
> 
> So you can see that the number of interrupts and context switches
> increases as well, and idle CPU drops to 66-70% on a 4-core system.
> 66% idle on a 4-core system means about 1.36 CPU cores in use
> (= 4 cores * (100% - 66%)), which also matches the observations in top.
> 
> The big question now is: why is the spi0 message-pump thread woken when 
> executing spi_sync calls without deferring to that thread in the first place?
> 
> Any ideas?
> 
> Thanks,
> 	Martin
> 