On Tue, Dec 15, 2009 at 9:47 PM, Sachin Sant <sachinp@xxxxxxxxxx> wrote:
> Peter Zijlstra wrote:
>>>
>>> I added some debug statements within the above code. This is a 2 cpu
>>> machine.
>>>
>>> XMON dest_cpu = 1024 . dead_cpu = 1 . nr_cpu_ids = 2
>>> XMON dest_cpu = 1024 XMON dest_cpu = 1024 . dead_cpu = 1
>>> XMON dest_cpu = 1024 . dead_cpu = 1 . nr_cpu_ids = 2
>>> XMON dest_cpu = 1024 XMON dest_cpu = 1024 . dead_cpu = 1
>>> XMON dest_cpu = 1024 . dead_cpu = 1 . nr_cpu_ids = 2
>>> XMON dest_cpu = 1024 XMON dest_cpu = 1024 . dead_cpu = 1
>>>
>>> Seems to me that the control is stuck in an infinite loop and hence the
>>> machine appears to be in hung state. The dest_cpu value is always 1024
>>> and never changes, which result in an infinite loop.
>>>
>>> In working scenario the o/p is something on the following lines
>>>
>>> XMON dest_cpu = 1024 . dead_cpu = 1 . nr_cpu_ids = 2
>>> XMON dest_cpu = 0 XMON dest_cpu = 1024 . dead_cpu = 1 . nr_cpu_ids = 2
>>> XMON dest_cpu = 0 XMON dest_cpu = 1024 . dead_cpu = 1 . nr_cpu_ids = 2
>>> XMON dest_cpu = 0
>>> Let me know if i should try to record any specific value ?
>>>
>>
>> Could you possibly print the two masks themselves? cpumask_scnprintf()
>> and friend come in handy for this.
>>
>> The dest_cpu=1024 thing seem to suggest the intersection between
>> p->cpus_allowed and cpu_active_mask is empty for some reason, even
>> though we forcefully reset p->cpus_allowed to the full set using
>> cpuset_cpus_allowed_locked().
>>
>
> So here is the data related to the two masks.
>
> cpu_active_mask = 00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000
> XMON dest_cpu = 1024
>

How about cpu_online_mask? Commit 6ad4c1 switches from cpu_online_mask to
cpu_active_mask. Is there a mismatch between cpu_online_mask and
cpu_active_mask?

> while p->cpus_allowed = 00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000001
> XMON dest_cpu = 1024
>
> In working scenario the above data looks like
>
> cpu_active_mask = 00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000002
> XMON dest_cpu = 1
>
> while p->cpus_allowed = 00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 00000000,00000000,00000002
> XMON dest_cpu = 1
>
>
> hope i got the data correct.
>
> Thanks
> -Sachin
>
>
> --
>
> ---------------------------------
> Sachin Sant
> IBM Linux Technology Center
> India Systems and Technology Labs
> Bangalore, India
> ---------------------------------
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-next" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
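For reference, here is a small userspace sketch of why an empty intersection between p->cpus_allowed and cpu_active_mask would show up as dest_cpu = 1024 in the debug output quoted above. It is not kernel code: every sketch_* name is hypothetical, and the 1024-bit width is an assumption read off the printed masks (32 words of 32 bits each). The search returns the mask width when no CPU is set in both masks, which appears consistent with the failing case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed width: the printed masks show 32 words of 32 bits = 1024 bits. */
#define SKETCH_NR_BITS   1024u
#define SKETCH_WORD_BITS 32u
#define SKETCH_NR_WORDS  (SKETCH_NR_BITS / SKETCH_WORD_BITS)

/* Hypothetical stand-in for a CPU mask; not the kernel's struct cpumask. */
struct sketch_mask {
    uint32_t word[SKETCH_NR_WORDS];
};

/* Mark one CPU as present in the mask. */
static void sketch_set_cpu(struct sketch_mask *m, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_BITS);
    m->word[cpu / SKETCH_WORD_BITS] |= (uint32_t)1 << (cpu % SKETCH_WORD_BITS);
}

/*
 * Return the lowest CPU number set in both masks, or SKETCH_NR_BITS
 * when the intersection is empty.
 */
static unsigned int sketch_first_and(const struct sketch_mask *a,
                                     const struct sketch_mask *b)
{
    unsigned int cpu;

    for (cpu = 0; cpu < SKETCH_NR_BITS; cpu++) {
        uint32_t both = a->word[cpu / SKETCH_WORD_BITS] &
                        b->word[cpu / SKETCH_WORD_BITS];

        if (both & ((uint32_t)1 << (cpu % SKETCH_WORD_BITS)))
            return cpu;
    }
    return SKETCH_NR_BITS;
}

/* Failing case from the thread: active mask empty, allowed mask = CPU 0. */
unsigned int sketch_failing_dest_cpu(void)
{
    struct sketch_mask active = { { 0 } };
    struct sketch_mask allowed = { { 0 } };

    sketch_set_cpu(&allowed, 0);
    return sketch_first_and(&allowed, &active);
}

/* Working case from the thread: both masks contain CPU 1 (...,00000002). */
unsigned int sketch_working_dest_cpu(void)
{
    struct sketch_mask active = { { 0 } };
    struct sketch_mask allowed = { { 0 } };

    sketch_set_cpu(&active, 1);
    sketch_set_cpu(&allowed, 1);
    return sketch_first_and(&allowed, &active);
}
```

With these inputs, sketch_failing_dest_cpu() returns 1024 and sketch_working_dest_cpu() returns 1, matching the two XMON dest_cpu values in the mask dumps. The sketch only shows how the value arises; it says nothing about why cpu_active_mask ended up empty.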