On 9/5/23 10:15, David Laight wrote:
...
That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would
only require a 31-bit shift, which should be OK, but 497 would require
a 32-bit shift, which would result in sign extension. If it turns out
that sign extension is OK, then we should get in trouble at 513 CPUs,
which would result in a 33-bit shift (and is that even defined in C?).
Not quite right :-)
(1 << 31) is an int and negative; it gets sign extended when it is
converted to 'unsigned long', so the result has the top 33 bits set.
(1 << 32) is undefined; current x86 CPUs ignore the high bits of the
shift count, so it ends up as (1 << 0).
Yes, I was about to reply the same thing. A shift of 31 is buggy,
because 1 << 31 sets the sign bit of the int, which in turn sets the
top 33 bits once the value is converted to unsigned long. A shift of
32 (1 << 32) is undefined, with x86 for instance ignoring the high
bits of the shift count.
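
To make the two cases above concrete, here is a small sketch that models them with fixed-width types. It assumes uint64_t stands in for a 64-bit unsigned long; widen_int_mask() and x86_like_shl32() are hypothetical helpers for illustration, not kernel code. The index 31 case is special-cased because 1 << 31 overflows int in ISO C:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of "1 << idx" computed as a 32-bit int and then converted to a
 * 64-bit unsigned type (standing in for unsigned long on 64-bit).
 * idx == 31 is special-cased because 1 << 31 overflows int in ISO C.
 */
static inline uint64_t widen_int_mask(unsigned int idx)
{
	int32_t as_int;

	assert(idx < 32);
	as_int = (idx == 31) ? INT32_MIN : (int32_t)(UINT32_C(1) << idx);
	return (uint64_t)as_int;	/* negative values sign-extend here */
}

/* Model of a 32-bit x86 shift, which uses only the low 5 bits of the count. */
static inline uint32_t x86_like_shl32(uint32_t val, unsigned int count)
{
	return val << (count & 31);
}
```

With these helpers, widen_int_mask(31) yields 0xffffffff80000000 (33 bits set), and x86_like_shl32(1, 32) yields 1, matching the (1 << 0) behaviour described above.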
But in order to have a 1 << 31 shift from this expression:
sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
I *think* we need the group to have 32 cpus or more (indexed within
the group from grplo to grplo + 31 (both inclusive)).
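
One possible way to keep this shift out of int territory is to do it in a 64-bit unsigned type. A minimal sketch, assuming the in-group index always stays below 64; struct demo_node and demo_grpmask() are hypothetical stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the node; only the field used here. */
struct demo_node {
	int grplo;	/* lowest CPU number covered by this group */
};

/* Build the per-CPU bit within the group using a 64-bit unsigned shift. */
static inline uint64_t demo_grpmask(int cpu, const struct demo_node *node)
{
	int idx = cpu - node->grplo;

	assert(idx >= 0 && idx < 64);
	return UINT64_C(1) << idx;	/* no sign bit involved at idx == 31 */
}
```

For a group starting at CPU 465, demo_grpmask(496, &node) gives 0x80000000 with the upper 32 bits clear.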
So as soon as we have one group with 32 cpus, the problem should show
up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1
group of 32 cpus, for:
15*31 + 32 = 497 cpus.
AFAIU, this would be the minimum value for smp_processor_id()+1 which
triggers this issue.
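
The count above can be restated as a tiny helper. This is only a sketch of the arithmetic in this message, assuming 16 groups where all but one hold one CPU fewer than the full group; min_cpus_for_full_group() is a hypothetical name:

```c
#include <assert.h>

/*
 * Smallest CPU count for which one of ngroups groups reaches full_size
 * CPUs, assuming the other groups each hold full_size - 1 CPUs.
 */
static inline unsigned int min_cpus_for_full_group(unsigned int ngroups,
						   unsigned int full_size)
{
	assert(ngroups >= 1 && full_size >= 1);
	return (ngroups - 1) * (full_size - 1) + full_size;
}
```

min_cpus_for_full_group(16, 32) evaluates to 15*31 + 32 = 497.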
Thanks,
Mathieu
If the mask is being used to optimise a search the code might
just happen to work!
David
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com