Hi,

I ran the cgroup_fj tests on an RT kernel with PREEMPT_RT_FULL disabled. The
system gets stuck whenever the cpuset stress tests run, and it happens every
time. By "stuck" I mean that the system is almost unresponsive and we can
hardly do anything on the terminal, but the kernel has neither crashed nor
deadlocked (according to the lockdep messages), and it still responds
occasionally.

The problem exists on all RT versions from 3.4.18-rt29 to 3.4.37-rt51 AFAIK,
but without the RT patches, or with PREEMPT_RT_FULL enabled, the problem does
not occur.

When the system is stuck, we get the following messages:

# dmesg
...
[96967.772181] NOHZ: local_softirq_pending 200
[96967.776398] NOHZ: local_softirq_pending 200
[96967.780212] NOHZ: local_softirq_pending 200
[96967.781215] NOHZ: local_softirq_pending 200
[96967.784152] NOHZ: local_softirq_pending 200
[96967.784310] NOHZ: local_softirq_pending 200
[96967.788239] NOHZ: local_softirq_pending 200
[96967.796092] NOHZ: local_softirq_pending 200
[96967.800089] NOHZ: local_softirq_pending 200
[96967.800225] NOHZ: local_softirq_pending 200
[97112.950055] ------------[ cut here ]------------
[97112.950068] WARNING: at /usr/src/packages/BUILD/kernel-default-3.4.24.03/linux-3.4/kernel/workqueue.c:1208 worker_enter_idle+0x1d3/0x200()
[97112.950073] Hardware name: Tecal RH2285
[97112.950076] Modules linked in: reiserfs minix hfs vfat fat tun xt_limit xt_tcpudp nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 x_tables dummy edd cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf loop dm_mod coretemp crc32c_intel igb ghash_clmulni_intel aesni_intel cryptd aes_x86_64 aes_generic iTCO_wdt bnx2 iTCO_vendor_support i7core_edac pcspkr i2c_i801 dca edac_core button rtc_cmos microcode serio_raw i2c_core ses enclosure sg mptctl ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw scsi_dh_rdac scsi_dh ata_generic ata_piix libata
mptsas mptscsih mptbase scsi_transport_sas scsi_mod [last unloaded: ip_tables]
[97112.950178] Pid: 5331, comm: kworker/0:2 Tainted: GF WC 3.4.24.03-0.1.2-default #1
[97112.950182] Call Trace:
[97112.950191] [<ffffffff8105e2d2>] warn_slowpath_common+0xb2/0x120
[97112.950196] [<ffffffff8105e365>] warn_slowpath_null+0x25/0x30
[97112.950202] [<ffffffff81085593>] worker_enter_idle+0x1d3/0x200
[97112.950207] [<ffffffff81084a95>] ? need_to_create_worker+0x15/0x50
[97112.950213] [<ffffffff8108a308>] worker_thread+0x2a8/0x4f0
[97112.950218] [<ffffffff8108a060>] ? rescuer_thread+0x320/0x320
[97112.950226] [<ffffffff81091d86>] kthread+0xc6/0xe0
[97112.950233] [<ffffffff81720454>] kernel_thread_helper+0x4/0x10
[97112.950239] [<ffffffff81091cc0>] ? __init_kthread_worker+0x50/0x50
[97112.950244] [<ffffffff81720450>] ? gs_change+0x13/0x13
[97112.950248] ---[ end trace 61f48fadbd018007 ]---

Here is a sample version of cgroup_fj which triggers this problem every time
(make sure CONFIG_CGROUPS and CONFIG_CPUSETS are enabled :)):

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# cat cgroup_fj.sh
#! /bin/sh

LOGFILE=./cgroup_fj-output.txt
TMPFILE=/tmp/cgroup_fj_tempfile.txt

subsystem=2
subsystem_name="cpuset"
subgroup_num=100
cur_subgroup_path1=""

get_subgroup_path1()
{
    cur_subgroup_path1=""
    if [ "$#" -ne 1 ] || [ "$1" -lt 1 ] ; then
        return;
    fi

    cur_subgroup_path1="/dev/cgroup/subgroup_$1/"
}

cleanup()
{
    mount_str="`mount -l | grep /dev/cgroup`"
    if [ "$mount_str" != "" ]; then
        umount /dev/cgroup
    fi

    if [ -e /dev/cgroup ]; then
        rmdir /dev/cgroup
    fi
}

setup()
{
    mkdir /dev/cgroup
    mount -t cgroup -o $subsystem_name cgroup /dev/cgroup
}

# Move every task still sitting in a subgroup back to the root cgroup.
reclaim_foundling()
{
    cat `find /dev/cgroup/subgroup_* -name "tasks"` > $TMPFILE
    nlines=`cat "$TMPFILE" | wc -l`
    for k in `seq 1 $nlines`
    do
        cur_pid=`sed -n "$k""p" $TMPFILE`
        if [ -e /proc/$cur_pid/ ];then
            echo "pid $cur_pid reclaimed"
            echo "$cur_pid" > "/dev/cgroup/tasks"
        fi
    done
}

########################## main #######################
echo "-------------------------------------------------------------------------" >> $LOGFILE

cleanup;
setup;

if [ $subsystem -eq 2 ]; then
    cpus=`cat /dev/cgroup/cpuset.cpus`
    mems=`cat /dev/cgroup/cpuset.mems`
fi

# Create the subgroups, each with the same cpus and mems as the root.
count=0
pathes[1]=""
for i in `seq 1 $subgroup_num`
do
    get_subgroup_path1 $i
    mkdir $cur_subgroup_path1
    if [ $subsystem -eq 2 ]; then
        echo "$cpus" > "$cur_subgroup_path1""cpuset.cpus"
        echo "$mems" > "$cur_subgroup_path1""cpuset.mems"
    fi
    let "count = $count + 1"
    pathes[$count]="$cur_subgroup_path1"
done
echo "...mkdired $count times" >> $LOGFILE

sleep 1

count2=$count
let "count2 = $count2 + 1"
pathes[0]="/dev/cgroup/"
pathes[$count2]="/dev/cgroup/"

# Move every task from pathes[i] to pathes[i+1]; the chain starts and
# ends at the root cgroup.
for i in `seq 0 $count`
do
    j=$i
    let "j = $j + 1"
    cat "${pathes[$i]}tasks" > $TMPFILE
    nlines=`cat "$TMPFILE" | wc -l`
    for k in `seq 1 $nlines`
    do
        cur_pid=`sed -n "$k""p" $TMPFILE`
        if [ -e /proc/$cur_pid/ ];then
            echo "$cur_pid" > "${pathes[$j]}tasks"
            echo "task: $cur_pid" >> $LOGFILE
            echo "target: ${pathes[$j]}tasks" >> $LOGFILE
        fi
    done
done

reclaim_foundling;

# Remove the subgroups in reverse order of creation.
for i in `seq 1 $count`
do
    j=$i
    let "j = $count - $j + 1"
    rmdir ${pathes[$j]}
done

sleep 1
cleanup; exit 0; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
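
For reading the "NOHZ: local_softirq_pending 200" lines: the value is a hex bitmask with one bit per softirq. Below is a minimal sketch for decoding it. It assumes the bit order follows the softirq enum in include/linux/interrupt.h of a 3.4 tree, and that the RT patches do not reorder it, so please check against your own source. The helper name decode_softirq_mask is made up for this example.

```shell
#!/bin/sh
# Decode the hex mask printed by "NOHZ: local_softirq_pending <mask>".
# ASSUMPTION: bit positions follow the softirq enum order of a 3.4
# kernel (include/linux/interrupt.h); verify against your own tree.
SOFTIRQ_NAMES="HI TIMER NET_TX NET_RX BLOCK BLOCK_IOPOLL TASKLET SCHED HRTIMER RCU"

# decode_softirq_mask is a hypothetical helper, not an existing tool.
decode_softirq_mask() {
    mask=$((0x$1))      # the kernel prints the mask in hex
    bit=0
    out=""
    for name in $SOFTIRQ_NAMES; do
        # Test one bit of the mask per softirq name, lowest bit first.
        if [ $(( ($mask >> $bit) & 1 )) -eq 1 ]; then
            out="${out:+$out }$name"
        fi
        bit=$((bit + 1))
    done
    echo "$out"
}

decode_softirq_mask 200    # prints: RCU
```

Under that assumed bit order, the mask 200 seen in the log corresponds to the RCU softirq being left pending when the CPU tries to stop its tick.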