On Wed, Nov 07, 2018 at 03:44:31PM +0000, Will Deacon wrote:
> Hi John,
>
> On Tue, Nov 06, 2018 at 08:39:33PM +0800, John Garry wrote:
> > Currently the NUMA distance map parsing does not validate the distance
> > table for the distance-matrix rules 1-2 in [1].
> >
> > However the arch NUMA code may enforce some of these rules, but not all.
> > Such is the case for the arm64 port, which does not enforce the rule that
> > the distance between separates nodes cannot equal LOCAL_DISTANCE.
> >
> > The patch adds the following rules validation:
> > - distance of node to self equals LOCAL_DISTANCE
> > - distance of separate nodes > LOCAL_DISTANCE
> >
> > A note on dealing with symmetrical distances between nodes:
> >
> > Validating symmetrical distances between nodes is difficult. If it were
> > mandated in the bindings that every distance must be recorded in the
> > table, validating symmetrical distances would be straightforward. However,
> > it isn't.
> >
> > In addition to this, it is also possible to record [b, a] distance only
> > (and not [a, b]). So, when processing the table for [b, a], we cannot
> > assert that current distance of [a, b] != [b, a] as invalid, as [a, b]
> > distance may not be present in the table and current distance would be
> > default at REMOTE_DISTANCE.
> >
> > As such, we maintain the policy that we overwrite distance [a, b] = [b, a]
> > for b > a. This policy is different to kernel ACPI SLIT validation, which
> > allows non-symmetrical distances (ACPI spec SLIT rules allow it). However,
> > the debug message is dropped as it may be misleading (for a distance which
> > is later overwritten).
> >
> > Some final notes on semantics:
> >
> > - It is implied that it is the responsibility of the arch NUMA code to
> >   reset the NUMA distance map for an error in distance map parsing.
> >
> > - It is the responsibility of the FW NUMA topology parsing (whether OF or
> >   ACPI) to enforce NUMA distance rules, and not arch NUMA code.
> >
> > [1] Documents/devicetree/bindings/numa.txt
> >
> > Signed-off-by: John Garry <john.garry@xxxxxxxxxx>
>
> Is it worth mentioning that the lack of this check was leading to a kernel
> crash with a malformed DT entry? So should be marked for stable too?
>
> > diff --git a/drivers/of/of_numa.c b/drivers/of/of_numa.c
> > index 35c64a4295e0..fe6b13608e51 100644
> > --- a/drivers/of/of_numa.c
> > +++ b/drivers/of/of_numa.c
> > @@ -104,9 +104,14 @@ static int __init of_numa_parse_distance_map_v1(struct device_node *map)
> >  		distance = of_read_number(matrix, 1);
> >  		matrix++;
> >
> > +		if ((nodea == nodeb && distance != LOCAL_DISTANCE) ||
> > +		    (nodea != nodeb && distance <= LOCAL_DISTANCE)) {
> > +			pr_err("Invalid distance[node%d -> node%d] = %d\n",
> > +			       nodea, nodeb, distance);
> > +			return -EINVAL;
> > +		}
> > +
> >  		numa_set_distance(nodea, nodeb, distance);
> > -		pr_debug("distance[node%d -> node%d] = %d\n",
> > -			 nodea, nodeb, distance);
>
> Looks good to me, although I'm not sure which tree this should go through.
>
> Acked-by: Will Deacon <will.deacon@xxxxxxx>

I'll take it. Please resend with the comment Will asked for.

Rob
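
For readers who want to try the two validation rules and the overwrite policy outside the kernel, here is a rough userspace sketch in C. It is not the kernel code: the sketch_* names, the fixed four-node table, and the flat array of (nodea, nodeb, distance) triples are all assumptions made for illustration. LOCAL_DISTANCE and REMOTE_DISTANCE use the values 10 and 20 that the kernel headers define. The mirroring step follows one reading of the "[a, b] = [b, a] for b > a" policy in the commit message, namely that an entry with nodeb > nodea also sets the reverse direction, and a later explicit entry for the reverse direction overwrites it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define LOCAL_DISTANCE   10	/* distance of a node to itself */
#define REMOTE_DISTANCE  20	/* default distance between separate nodes */
#define SKETCH_MAX_NODES 4	/* hypothetical table size for this sketch */

/* Hypothetical stand-in for the arch NUMA distance table. */
static int sketch_distance[SKETCH_MAX_NODES][SKETCH_MAX_NODES];

/* Reset the table to defaults: LOCAL on the diagonal, REMOTE elsewhere. */
static void sketch_reset(void)
{
	for (int a = 0; a < SKETCH_MAX_NODES; a++)
		for (int b = 0; b < SKETCH_MAX_NODES; b++)
			sketch_distance[a][b] =
				(a == b) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/* Hypothetical setter; callers must pass in-range node ids. */
static void sketch_set_distance(int a, int b, int distance)
{
	assert(a >= 0 && a < SKETCH_MAX_NODES);
	assert(b >= 0 && b < SKETCH_MAX_NODES);
	sketch_distance[a][b] = distance;
}

/*
 * Parse 'entries' triples of (nodea, nodeb, distance) from 'matrix'.
 * Returns 0 on success or -EINVAL on the first invalid entry. The
 * caller is expected to reset the table after an error.
 */
static int sketch_parse_distance_map(const int *matrix, int entries)
{
	for (int i = 0; i < entries; i++) {
		int nodea = matrix[3 * i];
		int nodeb = matrix[3 * i + 1];
		int distance = matrix[3 * i + 2];

		if (nodea < 0 || nodea >= SKETCH_MAX_NODES ||
		    nodeb < 0 || nodeb >= SKETCH_MAX_NODES)
			return -EINVAL;

		/* Rule 1: self distance must equal LOCAL_DISTANCE.
		 * Rule 2: separate nodes must be > LOCAL_DISTANCE. */
		if ((nodea == nodeb && distance != LOCAL_DISTANCE) ||
		    (nodea != nodeb && distance <= LOCAL_DISTANCE)) {
			fprintf(stderr,
				"Invalid distance[node%d -> node%d] = %d\n",
				nodea, nodeb, distance);
			return -EINVAL;
		}

		sketch_set_distance(nodea, nodeb, distance);

		/* Mirror into the reverse direction when nodeb > nodea. */
		if (nodeb > nodea)
			sketch_set_distance(nodeb, nodea, distance);
	}
	return 0;
}
```

Returning on the first bad entry leaves earlier entries in the table, which is why the commit message puts the reset on the caller.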