On Sun, 2014-02-02 at 21:10 +0100, Sebastian Andrzej Siewior wrote:

> According to the backtrace both of them are trying to access the
> per-cpu hrtimer (sched_timer) in order to cancel but they seem to fail
> to get the timer lock here. They shouldn't spin there for minutes, I
> have no idea why they did so…

Hm. per-cpu...

I've been chasing an rt hotplug heisenbug that is pointing to per-cpu
oddness.  During sched domain re-construction while running Steven's
stress script on a 64 core box, we hit a freshly constructed domain with
_no span_, and build_sched_groups()->get_group() explodes when we meet
it.  But if you try to watch the thing appear... it just doesn't.

static int build_sched_domains(const struct cpumask *cpu_map,
			       struct sched_domain_attr *attr)
{
	enum s_alloc alloc_state;
	struct sched_domain *sd;
	struct s_data d;
	int i, ret = -ENOMEM;

	alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
	if (alloc_state != sa_rootdomain)
		goto error;

	/* Set up domains for cpus specified by the cpu_map. */
	for_each_cpu(i, cpu_map) {
		struct sched_domain_topology_level *tl;

		sd = NULL;
		for_each_sd_topology(tl) {
			sd = build_sched_domain(tl, cpu_map, attr, sd, i);

BUG_ON(sd == spanless-alien) here..

			if (tl == sched_domain_topology)
				*per_cpu_ptr(d.sd, i) = sd;
			if (tl->flags & SDTL_OVERLAP || sched_feat(FORCE_SD_OVERLAP))
				sd->flags |= SD_OVERLAP;
			if (cpumask_equal(cpu_map, sched_domain_span(sd)))
				break;
		}
	}

	/* Build the groups for the domains */
	for_each_cpu(i, cpu_map) {
		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
			sd->span_weight = cpumask_weight(sched_domain_span(sd));
			if (sd->flags & SD_OVERLAP) {
				if (build_overlap_sched_groups(sd, i))
					goto error;
			} else {
				if (build_sched_groups(sd, i))

..prevents meeting that alien here.. while hotplug locked.

static int get_group(int cpu, struct sd_data *sdd, struct sched_group **sg)
{
	struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu);
	struct sched_domain *child = sd->child;

	if (child)
		cpu = cpumask_first(sched_domain_span(child));
		      ^^^nr_cpus

	if (sg) {
		*sg = *per_cpu_ptr(sdd->sg, cpu);

BOOM
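
For anyone who wants to see why a spanless child is fatal in get_group(), here is a rough userspace sketch, not kernel code. It assumes a first-set-bit lookup that returns the CPU count when the mask is empty (as far as I know the kernel's cpumask_first() returns >= nr_cpu_ids in that case), so the value handed to per_cpu_ptr() is one past the last valid CPU. The toy_* names and TOY_NR_CPUS are made up for illustration, and a 64-bit word stands in for struct cpumask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU count, chosen to match the 64 core box. */
#define TOY_NR_CPUS 64u

/*
 * Toy stand-in for cpumask_first(): return the lowest set bit,
 * or TOY_NR_CPUS when the mask is empty.
 */
static unsigned int toy_cpumask_first(uint64_t mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (mask & ((uint64_t)1 << cpu))
			return cpu;

	return TOY_NR_CPUS;	/* no bit set: out-of-range sentinel */
}

/* A per-cpu index is only usable if it names a real CPU. */
static int toy_index_valid(unsigned int cpu)
{
	return cpu < TOY_NR_CPUS;
}

/* Walk through the sane case and the spanless case. */
static void toy_self_check(void)
{
	uint64_t sane_span = (uint64_t)1 << 3;	/* child spans CPU 3 only */
	uint64_t no_span = 0;			/* the "spanless alien" */

	/* Sane child: we get a real CPU back, indexing is fine. */
	assert(toy_cpumask_first(sane_span) == 3);
	assert(toy_index_valid(toy_cpumask_first(sane_span)));

	/* Spanless child: we get the sentinel, indexing would go BOOM. */
	assert(toy_cpumask_first(no_span) == TOY_NR_CPUS);
	assert(!toy_index_valid(toy_cpumask_first(no_span)));
}
```

This only models the symptom (the out-of-range index); it says nothing about how the domain ends up without a span in the first place.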