Re: [GIT PULL] memblock:fix validation of NUMA coverage

On Thu, Jun 13, 2024 at 10:38:28AM -0700, Linus Torvalds wrote:
> On Thu, 13 Jun 2024 at 10:09, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Is there some broken scripting that people have started using (or have
> > been using for a while and was recently broken)?
> 
> ... and then when I actually pull the code, I note that the problem
> where it checked _one_ bogus value has just been replaced with
> checking _another_ bogus value.
> 
> Christ.
> 
> What if people use a node ID that is simply outside the range
> entirely, instead of one of those special node IDs?
> 
> And now for memblock_set_node() you should apparently use NUMA_NO_NODE
> to not get a warning, but for memblock_set_region_node() apparently
> the right random constant to use is MAX_NUMNODES.
> 
> Does *any* of this make sense? No.
> 
> How about instead of having two random constants - and not having any
> range checking that I see - just have *one* random constant for "I
> have no range", call that NUMA_NO_NODE, and then have a simple helper
> for "do I have a valid range", and make that be
> 
>    static inline bool numa_valid_node(int nid)
>    { return (unsigned int)nid < MAX_NUMNODES; }
> 
> or something like that? Notice that now *all* of
> 
>  - NUMA_NO_NODE (explicitly no node)
> 
>  - MAX_NUMNODES (randomly used no node)
> 
>  - out of range node (who knows wth firmware tables do?)
> 
> will get the same result from that "numa_valid_node()" function.
> 
> And at that point you don't need to care, you don't need to warn, and
> you don't need to have these insane rules where "sometimes you *HAVE*
> to use NUMA_NO_NODE, or we warn, in other cases MAX_NUMNODES is the
> thing".
> 
> Please? IOW, instead of adding a warning for fragile code, then change
> some caller to follow the new rules, JUST FIX THE STUPID FRAGILITY!
> 
> Or hey, just do
> 
>     #define NUMA_NO_NODE MAX_NUMNODES
> 
> and have two names for the *same* constant, instead of having two
> different constants with strange semantic differences that seem to
> make no sense and where the memblock code itself seems to go
> back-and-forth on it in different contexts.

A single constant is likely to backfire: I remember seeing checks like
'if (nid < 0)', so redefining NUMA_NO_NODE would require auditing all of
those.
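
To illustrate the concern, here is a minimal userspace sketch. It assumes
NUMA_NO_NODE is (-1), as the 'nid < 0' checks imply; the MAX_NUMNODES value
and the function name nid_is_unset_legacy() are made up for the example and
are not taken from any real caller:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value only: the real MAX_NUMNODES depends on the config. */
#define MAX_NUMNODES	64
/* Assumed current definition, consistent with 'if (nid < 0)' checks. */
#define NUMA_NO_NODE	(-1)

/* Hypothetical caller-side check of the 'if (nid < 0)' kind. */
static bool nid_is_unset_legacy(int nid)
{
	return nid < 0;
}

/* Shows how such a check reacts to the two "no node" spellings. */
static void legacy_check_demo(void)
{
	/* Works while "no node" is a negative value. */
	assert(nid_is_unset_legacy(NUMA_NO_NODE));
	/* Would silently stop matching if "no node" became MAX_NUMNODES. */
	assert(!nid_is_unset_legacy(MAX_NUMNODES));
}
```

Every check written like this would need to be found and converted before
the two constants could be merged.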

But a helper function works great.
I could only lightly test it as I don't have a fleet of machines with a
variety of memory layouts, so I'm planning to push it into -next early next
week (with the subject replaced by a more informative one).
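
For reference, a minimal userspace sketch of the helper suggested above and
how it treats the three cases from the list. The constant values are
assumptions for illustration (MAX_NUMNODES is config dependent, NUMA_NO_NODE
is taken to be -1), and numa_valid_node_demo() is a made-up name; this is not
the actual patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only, see the note above. */
#define MAX_NUMNODES	64
#define NUMA_NO_NODE	(-1)

/*
 * The cast to unsigned turns any negative nid into a huge value, so a
 * single comparison rejects negative IDs, MAX_NUMNODES, and anything
 * beyond it.
 */
static inline bool numa_valid_node(int nid)
{
	return (unsigned int)nid < MAX_NUMNODES;
}

/* Exercises the three "no usable node" cases plus the valid range. */
static void numa_valid_node_demo(void)
{
	assert(!numa_valid_node(NUMA_NO_NODE));	/* explicitly no node */
	assert(!numa_valid_node(MAX_NUMNODES));	/* randomly used no node */
	assert(!numa_valid_node(1000));		/* out of range node */
	assert(numa_valid_node(0));		/* first valid node */
	assert(numa_valid_node(MAX_NUMNODES - 1));	/* last valid node */
}
```

With this, callers don't need to care which "no node" value they were handed.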

