Sorry for the delay.

> Well what's actually happening here?  Where is the alleged deadlock?
>
> In the kernel_init() case we have a GFP_KERNEL allocation inside
> get_online_cpus().  In the other case we simply have kswapd calling
> get_online_cpus(), yes?

Yes.

> Does lockdep consider all kswapd actions to be "in reclaim context"?
> If so, why?

kswapd calls lockdep_set_current_reclaim_state() when the thread starts.
See below.

----------------------------------------------------------------------
static int kswapd(void *p)
{
	unsigned long order;
	pg_data_t *pgdat = (pg_data_t*)p;
	struct task_struct *tsk = current;

	struct reclaim_state reclaim_state = {
		.reclaimed_slab = 0,
	};
	const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);

	lockdep_set_current_reclaim_state(GFP_KERNEL);
	......
----------------------------------------------------------------------

> > I think we have two options: 1) call lockdep_clear_current_reclaim_state()
> > every time, or 2) use for_each_possible_cpu instead of for_each_online_cpu.
> >
> > The following patch uses (2) because removing get_online_cpus() has a good
> > side effect: it reduces the potential cpu-hotplug vs memory-shortage
> > deadlock risk.
>
> Well.  Being able to run for_each_online_cpu() is a pretty low-level
> and fundamental thing.  It's something we're likely to want to do more
> and more of as time passes.  It seems a bad thing to tell ourselves
> that we cannot use it in reclaim context.  That blots out large chunks
> of filesystem and IO-layer code as well!
>
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -193,18 +193,16 @@ void set_pgdat_percpu_threshold(pg_data_t *pgdat,
> >  	int threshold;
> >  	int i;
> >
> > -	get_online_cpus();
> >  	for (i = 0; i < pgdat->nr_zones; i++) {
> >  		zone = &pgdat->node_zones[i];
> >  		if (!zone->percpu_drift_mark)
> >  			continue;
> >
> >  		threshold = (*calculate_pressure)(zone);
> > -		for_each_online_cpu(cpu)
> > +		for_each_possible_cpu(cpu)
> >  			per_cpu_ptr(zone->pageset, cpu)->stat_threshold
> >  							= threshold;
> >  	}
> > -	put_online_cpus();
> >  }
>
> That's a pretty sad change IMO, especially if num_possible_cpus is much
> larger than num_online_cpus.

As far as I know, CPU hotplug is used mainly on servers, and almost all
servers have ACPI or a similar flexible firmware interface. So
num_possible_cpus is not much bigger than the actual number of sockets.

IOW, I haven't heard of embedded people using CPU hotplug. If you have,
please let me know.

> What do we need to do to make get_online_cpus() safe to use in reclaim
> context?  (And in kswapd context, if that's really equivalent to
> "reclaim context").

Hmm... That's too hard. kmalloc() is called from everywhere and CPU hotplug
can happen at any time, so any lock design would break the rule you are
asking for. ;)

And again, for _now_ I don't think for_each_possible_cpu() is very costly.
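
To illustrate the trade-off discussed above, here is a minimal user-space
sketch. It is not kernel code: struct cpu_model, NR_CPUS, model_cpu_up()
and the two set_threshold_*() helpers are hypothetical stand-ins for the
possible/online CPU masks and the per-cpu stat_threshold field. It only
models why writing to every possible CPU needs no hotplug lock (the
possible set is assumed never to change after boot), at the cost of
visiting CPUs that are not online.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical maximum, not a real kernel config value */

/* Hypothetical model: bit N set in a mask means CPU N is in that set. */
struct cpu_model {
	uint32_t possible;		/* assumed fixed after boot */
	uint32_t online;		/* changes on hotplug, subset of possible */
	int stat_threshold[NR_CPUS];	/* stands in for the per-cpu field */
};

/* Model a hotplug "CPU up" event: only a possible CPU can come online. */
void model_cpu_up(struct cpu_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(m->possible & (1u << cpu));
	m->online |= 1u << cpu;
}

/*
 * Original behaviour: update online CPUs only. In the kernel the online
 * set would have to be held stable by get_online_cpus() during the walk.
 * Returns the number of CPUs visited.
 */
int set_threshold_online(struct cpu_model *m, int threshold)
{
	int visited = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (m->online & (1u << cpu)) {
			m->stat_threshold[cpu] = threshold;
			visited++;
		}
	}
	return visited;
}

/*
 * Patched behaviour: update every possible CPU. The set never changes in
 * this model, so no lock is needed, but offline CPUs are visited too.
 */
int set_threshold_possible(struct cpu_model *m, int threshold)
{
	int visited = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (m->possible & (1u << cpu)) {
			m->stat_threshold[cpu] = threshold;
			visited++;
		}
	}
	return visited;
}
```

With 4 possible CPUs of which 2 are online, the online variant visits 2
entries and the possible variant visits 4. That extra work is the cost in
question, and it grows only with the gap between num_possible_cpus and
num_online_cpus.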