On Mon, Apr 22, 2013 at 12:00:47PM -0400, Steven Rostedt wrote:
| On Mon, 2013-04-22 at 17:39 +0800, Li Zefan wrote:
| > On 2013/4/19 15:30, Qiang Huang wrote:
| > > Hi,
| > >
| > > I ran cgroup_fj tests on RT kernel with PREEMPT_RT_FULL disabled, it will
| > > stick the system when ran cpuset stress tests, it happens everytime.
| > >
| > > Here stick the system means there are almost no response from the system and
| > > we can hardly do anything on the terminal, but kernel isn't crash nor deadlocked
| > > (according to the lockdep message), and it may do some response sometimes.
| > >
| > > The problem exists on all RT versions from 3.4.18-rt29 to 3.4.37-rt51 AFAIK, but
| > > without RT patches or with PREEMPT_RT_FULL enabled, the problem isn't exists.
| > >
| > > When the system is stuck, we will get the following message:
| > > # dmesg
| > > ...
| >
| > I've found the culprit after some investigation:
| >
| > From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
| > Date: Fri, 04 Nov 2011 19:48:36 +0000
| > Subject: sched-clear-pf-thread-bound-on-fallback-rq.patch
| >
| > At system boot when some cpus haven't been up, the scheduler calls select_fallback_rq()
| > and schedules tasks in other cpus, which ends up clearing some kernel threads'
| > PF_THREAD_BOUND flag...
|
| I'm curious to why this doesn't break when PREEMPT_RT_FULL is enabled. I
| would think it would also cause issues there too.

It does break when PREEMPT_RT_FULL is enabled :)

I was able to consistently reproduce the issue on the latest 3.6-rt
kernel this weekend. I was also able to confirm that the patch in this
thread did mitigate the issue.

Cheers,
Luis