Re: [PATCH] Do not use cpu_to_node() to find an offlined cpu's node.

On Tue, 9 Oct 2012, Tang Chen wrote:

> > > Eek, the nid shouldn't be -1 yet, though, for cpu hotplug since this
> > > should be called at CPU_DYING level and migrate_tasks() still sees a valid
> > > cpu.
> 
> As Wen said below, nid is now set to -1 when cpu is hotremoved.
> I reproduced this problem in the following situation:
> 
> all cpus are online, and a system board is hot-removed directly, without
> offlining any cpu first.
> 
> As a result, the removed cpu's nid is set to -1, and this causes
> problems.
> 

Let's add Andrew to the cc list then, because I'm nacking 
cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved.patch in the -mm 
tree for this reason.

We can only clear a cpu-to-node mapping when the cpu is completely 
offline, not before or during the CPU_DYING stage.  Kernel code, such as 
the sched code that you are now trying to "fix", depends on this mapping 
to work correctly; obviously no audit was done of cpu hotplug code 
depending on it before the patch was proposed.

I say "fix" because even this workaround isn't a good solution: it would 
be much better to pick another cpu on the same node as the offlining cpu 
for the runqueue before falling back to the set of all allowed nodes.  
We lose all NUMA affinity information with that patch.  There's no reason 
why we shouldn't know the node of a cpu that is being offlined.
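
To make the preference order concrete, here is a minimal user-space sketch 
of it.  It is not the kernel's scheduler code: cpu_node[], cpu_active[], 
NR_CPUS, NO_CPU and pick_fallback_cpu() are hypothetical stand-ins invented 
for illustration, and the two-node topology is assumed.  The point is only 
that the node-local pass works while the dying cpu's node id is still 
valid; if the mapping has already been reset to -1, that pass has nothing 
to match against.

```c
#include <stdbool.h>

#define NR_CPUS 8
#define NO_CPU  (-1)

/* Hypothetical cpu-to-node table: cpus 0-3 on node 0, cpus 4-7 on node 1.
 * It stays valid for a cpu until that cpu is completely offline. */
static const int cpu_node[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Hypothetical "cpu can accept tasks" flags, set by the caller. */
static bool cpu_active[NR_CPUS];

/*
 * Choose a destination cpu for a task leaving dying_cpu.
 * allowed[] is the task's affinity mask (assumed representation).
 */
int pick_fallback_cpu(int dying_cpu, const bool allowed[NR_CPUS])
{
	int nid = cpu_node[dying_cpu];
	int cpu;

	/* Pass 1: prefer an active, allowed cpu on the same node. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == dying_cpu)
			continue;
		if (cpu_node[cpu] == nid && cpu_active[cpu] && allowed[cpu])
			return cpu;
	}

	/* Pass 2: fall back to any active, allowed cpu on any node. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == dying_cpu)
			continue;
		if (cpu_active[cpu] && allowed[cpu])
			return cpu;
	}

	return NO_CPU;	/* nothing suitable; the caller must widen the mask */
}
```

With every cpu active and allowed, a task leaving cpu 5 lands on cpu 4 
(same node).  Only when the rest of node 1 is unavailable does it move to 
node 0 and lose its NUMA locality.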

So nack to cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved.patch.  
After it's removed because it's buggy, this "fix" will no longer be 
necessary.