Hi Julia, Thomas,

Thanks for all the input. The information is really helpful in giving
me a better understanding of the internals. :)

On Fri, 2017-03-03 at 22:09 +0100, Thomas Gleixner wrote:
> On Fri, 3 Mar 2017, Julia Cartwright wrote:
> >
> > Without PREEMPT_RT_FULL enabled, the critical section is executed
> > with "raw" spinlocks, and is therefore non-preemptible. However,
> > with RT_FULL, the preemptibility of the section leads to the
> > "bounce".
> >
> > That should make it clear why ktimersoftd would be PI boosted, as
> > well.
> >
> > Now, it isn't clear to me why the affinitized scenario appears to
> > make this happen more frequently... Nor do I have a handle on what
> > to do to fix this (if anything).
> The point here is:
>
>   perf stat taskset 1 cyclictest -t1
>
> will make the control thread of cyclictest affine to cpu 0 and also
> the measuring thread. perf stat counts the context switches of both.

I believe it does not show up in the non-affine scenario because the
cyclictest process is migrated to another CPU every time it goes to
sleep after calling sigwait(). No "additional" context switches are
required between ktimersoftd and cyclictest because they run on
different CPUs.

Also, I am using the '-a' argument provided by cyclictest to pin the
process to the CPU. This does not pin the main thread. I confirmed it
using trace-cmd and kernelshark. The number of context switches for
the main thread is also pretty low. For example, if I run 100000
loops, there are ~300000 context switches, of which only 1000 are from
the main thread.

I checked the mainline kernel, and the softirqs seem to execute with
interrupts disabled, which is probably why the issue is avoided there.

I now have the following concerns and comments:

1. Real-time kernel vs. mainline kernel:

The real-time kernel is worse with POSIX timers than the mainline
kernel. This is odd.
Is this because the softirqs are not the same anymore? (Sorry, I am
still not familiar with what they have become. I deduced the "not the
same" part from commit messages and comments in the code. :))

Also, in
https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.4-rt&id=4aa7cba57f73acf1e3e4998ae1650965317c2de1
it is mentioned:
"
Bring back the softirq split for now, until we fixed the signal
delivery problem for real.
"
What signal delivery problem is being referred to?

2. When the CPU affinity is set, the CPU is shared between the
cyclictest thread and the ktimersoftd thread. So, in the end, 1 CPU
migration is cheaper than 3 context switches... Is this the right
analysis? I know this will vary from application to application.

Thanks,
Vedang Patel
Software Engineer
Intel Corporation

>  CPU 0
>  cyclictest-control
>
>  --> Interrupt
>
>  ksoftirqd
>
>  cyclictest-measure
>    rearm timer
>    sleep
>
>  cyclictest-control
>
>  ....
>
> versus a non affine scenario
>
>  CPU0                 CPU1            CPU2
>  cyclictest-control   interrupt
>                         ksoftirqd --> cyclictest-measure
>                                         rearm timer
>                                         sleep
>
>                       interrupt
>  cyclictest    <--      ksoftirqd
>  -measure
>
> Thanks,
>
> 	tglx
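
For anyone who wants to sanity-check the per-loop numbers above without
perf, here is a rough Python sketch. It is not how cyclictest works (it
sleeps with time.sleep() instead of using POSIX timers and sigwait()),
and it only sees the calling thread, not ktimersoftd. The function names
are made up for illustration; it assumes Linux, where RUSAGE_THREAD and
os.sched_setaffinity() are available.

```python
import os
import resource
import time


def switches_per_loop(total_switches, main_thread_switches, loops):
    """Average context switches per loop, excluding the main thread."""
    return (total_switches - main_thread_switches) / loops


def count_own_switches(loops=100, interval_s=0.001, cpu=None):
    """Sleep `loops` times and return the (voluntary, involuntary)
    context switches of the calling thread over that period.

    If `cpu` is given, pin the caller to that CPU first (Linux only).
    """
    if cpu is not None:
        os.sched_setaffinity(0, {cpu})  # pid 0 means "the caller"
    before = resource.getrusage(resource.RUSAGE_THREAD)
    for _ in range(loops):
        time.sleep(interval_s)
    after = resource.getrusage(resource.RUSAGE_THREAD)
    return (after.ru_nvcsw - before.ru_nvcsw,
            after.ru_nivcsw - before.ru_nivcsw)


if __name__ == "__main__":
    # Numbers from this mail: ~300000 switches, 1000 from the main
    # thread, 100000 loops.
    print(switches_per_loop(300000, 1000, 100000))
    print(count_own_switches())
```

With the numbers from this mail, the first call works out to roughly 3
switches per loop, which matches the "3 context switches" in point 2.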