Re: Migrating ksoftirqd? (was: msleep_interruptible() sleeps *way* too long on PowerPC)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 3, 2012 at 6:08 PM, <vomlehn@xxxxxxxxx> wrote:
>
> If I am interpreting the ftrace output correctly, I seem to be having a
> problem with migrating ksoftirqd threads. I've attached a trace below
> but the following line illustrates the issue:
>
> #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> #              | |       |          |         |
> ...
>      ksoftirqd/1-15    [000]   413.997630: dmvl_debug0 <-run_ksoftirqd
>
> I would expect ksoftirqd/1 to remain on CPU 1. If migration is actually
> happening, I think explains the strange msleep_interruptible() behavior
> I'm seeing--ksoftirqd only runs the timer queue for the processor on which
> it's running, and no ksoftirqd is running on the processor on which the
> timer_list entry was added.
>
> [Note: dmvl_debug0() is a dummy function I inserted at the top of the loop
> in run_ksoftirqd so I could verify with ftrace that it actually got there]
>
> --------------------------------- cut here ------------------------------------
> # tracer: function
> #
> #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> #              | |       |          |         |
>      ksoftirqd/1-15    [001]   413.985590: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.985596: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.985603: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.985613: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.986585: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.986595: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.986600: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.986613: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.987581: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.987595: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.987596: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.987607: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.988581: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.988594: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.988596: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.988605: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.989581: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.989597: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.989601: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.989603: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.990581: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.990589: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.990594: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.990607: dmvl_debug0 <-run_ksoftirqd
>           rcuc/0-8     [000]   413.990630: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.990642: __do_softirq <-local_bh_enable
>      ksoftirqd/0-3     [000]   413.990650: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.991580: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.991586: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.991592: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.991598: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.992579: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.992593: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.992597: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.992599: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.993579: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.993586: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.993591: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.993604: dmvl_debug0 <-run_ksoftirqd
>           rcuc/0-8     [000]   413.993623: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.993635: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.993644: __do_softirq <-local_bh_enable
>      ksoftirqd/0-3     [000]   413.993652: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.994584: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.994590: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.994593: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.994603: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.995583: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.995589: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.995590: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.995600: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.996582: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.996590: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.996592: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.996598: dmvl_debug0 <-run_ksoftirqd
>           rcuc/0-8     [000]   413.996616: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.996629: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.996639: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.996648: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.996658: __do_softirq <-local_bh_enable
>           rcuc/0-8     [000]   413.996667: __do_softirq <-local_bh_enable
>      ksoftirqd/0-3     [000]   413.996675: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.997584: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.997592: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.997601: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.997603: dmvl_debug0 <-run_ksoftirqd
>           rcuc/1-14    [001]   413.997627: __do_softirq <-local_bh_enable
>      ksoftirqd/1-15    [000]   413.997630: dmvl_debug0 <-run_ksoftirqd
>           rcuc/1-14    [001]   413.997643: __do_softirq <-local_bh_enable
>      ksoftirqd/1-15    [000]   413.997644: dmvl_debug0 <-run_ksoftirqd
>           rcuc/1-14    [001]   413.997655: __do_softirq <-local_bh_enable
>      ksoftirqd/1-15    [000]   413.997657: dmvl_debug0 <-run_ksoftirqd
>           rcuc/1-14    [001]   413.997668: __do_softirq <-local_bh_enable
>      ksoftirqd/1-15    [000]   413.997670: dmvl_debug0 <-run_ksoftirqd
>           rcuc/1-14    [001]   413.997681: __do_softirq <-local_bh_enable
>      ksoftirqd/1-15    [000]   413.997682: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.998596: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.998607: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.998608: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.998616: dmvl_debug0 <-run_ksoftirqd
>      ksoftirqd/0-3     [000]   413.999583: ksoftirqd_do_softirq <-run_ksoftirqd
>      ksoftirqd/1-15    [001]   413.999590: ksoftirqd_do_softirq <-run_ksoftirqd
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Would you mind posting the kernel config file (that might help others
to look at appropriate source code paths)? I see that the per-cpu
"rcuc" thread is running on core-1 between the two time-stamps
413.997603 and 413.998608 (which are the time-stamps before and after
when ksoftirqd/1 ran on core-1 correctly and ran on core-0 between
which). "rcuc" is related to the RCU boost feature. I am wondering
whether it might not be a bad idea to disable CONFIG_RCU_BOOST and see
whether the problem still reproduces (not sure whether it will prove
anything definitely but might give a useful data point?).

Do you use cgroup cpusets? I don't think it is related to that though.

-Sankara
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux