On Tue 24-09-19 11:17:14, Peter Zijlstra wrote:
> On Tue, Sep 24, 2019 at 09:47:51AM +0200, Michal Hocko wrote:
> > On Mon 23-09-19 22:34:10, Peter Zijlstra wrote:
> > > On Mon, Sep 23, 2019 at 06:52:35PM +0200, Michal Hocko wrote:
> > [...]
> > > > I even the
> > > > ACPI standard is considering this optional. Yunsheng Lin has referred to
> > > > the specific part of the standard in one of the earlier discussions.
> > > > Trying to guess the node affinity is worse than providing all CPUs IMHO.
> > >
> > > I'm saying the ACPI standard is wrong.
> >
> > Even if you were right on this the reality is that a HW is likely to
> > follow that standard and we cannot rule out NUMA_NO_NODE being
> > specified. As of now we would access beyond the defined array and that
> > is clearly a bug.
>
> Right, because the device node is wrong, so we fix _that_!
>
> > Let's assume that this is really a bug for a moment. What are you going
> > to do about that? BUG_ON? I do not really see any solution besides to either
> > provide something sensible or BUG_ON. If you are worried about a
> > conditional then this should be pretty easy to solve by starting the
> > array at -1 index and associate it with the online cpu mask.
>
> The same thing I proposed earlier; force the device node to 0 (or any
> other convenient random valid value) and issue a FW_BUG message to the
> console.

Why would you "fix" anything, and how do you know that node 0 is the
right choice? I have seen setups where node 0 has no memory at all, and
similar unexpected things.

To be honest, I really fail to see why you object to the simple semantic
that NUMA_NO_NODE implies all usable CPUs. Could you explain that please?

-- 
Michal Hocko
SUSE Labs
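
For illustration only, here is a minimal userspace sketch of the "start the
array at -1 and associate it with the online cpu mask" idea mentioned above.
It is not the kernel implementation: cpumask_t is reduced to a plain bitmask,
and MAX_NODES, mask_storage, node_to_mask, mask_of_node() and
setup_example_topology() are hypothetical names invented for this example.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)
#define MAX_NODES 2               /* hypothetical node count for the example */

typedef unsigned long cpumask_t;  /* hypothetical stand-in: one bit per CPU */

/* One extra slot in front so that index -1 is backed by real storage. */
static cpumask_t mask_storage[MAX_NODES + 1];

/* node_to_mask[-1] aliases mask_storage[0]; node_to_mask[n] is node n. */
static cpumask_t *const node_to_mask = mask_storage + 1;

/* The lookup itself needs no special case for NUMA_NO_NODE. */
static cpumask_t mask_of_node(int node)
{
    /* Sanity check for the sketch only; anything below -1 is invalid. */
    assert(node >= NUMA_NO_NODE && node < MAX_NODES);
    return node_to_mask[node];
}

/* Example topology: CPUs 0-1 on node 0, CPUs 2-3 on node 1, all online. */
static void setup_example_topology(void)
{
    node_to_mask[0] = 0x3UL;
    node_to_mask[1] = 0xCUL;
    /* The -1 slot carries the online mask, i.e. all usable CPUs. */
    node_to_mask[NUMA_NO_NODE] = node_to_mask[0] | node_to_mask[1];
}
```

With this layout, a device whose firmware reports NUMA_NO_NODE gets every
online CPU from the same array lookup as any other node, so no out-of-bounds
access occurs and no conditional is needed on the lookup path.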