On 9/5/23 09:26, Paul E. McKenney wrote:
On Tue, Sep 05, 2023 at 08:26:51AM -0400, Mathieu Desnoyers wrote:
On 9/5/23 05:31, David Laight wrote:
From: Mathieu Desnoyers
Sent: 04 September 2023 11:24
On 9/4/23 05:42, Denis Arefev wrote:
The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo)
is subject to overflow due to a failure to cast operands to a larger
data type before performing arithmetic
The patch title should identify more precisely its context, e.g.:
"srcu: Fix srcu_struct node grpmask overflow on 64-bit systems"
Also, as I stated in my reply to the previous version, the patch commit
message should describe the impact of the bug it fixes.
And is the analysis complete?
Is 1UL right for 32bit archs??
Is 64 bits even enough??
I understand from include/linux/rcu_node_tree.h and kernel/rcu/Kconfig
RCU_FANOUT and RCU_FANOUT_LEAF ranges that a 32-bit integer is sufficient to
hold the mask on 32-bit architectures, and a 64-bit integer is enough on
64-bit architectures given the current implementation.
At least this appears to be the intent. I did not do a thorough analysis of
the various parameter limits.
Mathieu has it right.
32-bit kernels are unaffected by this bug.
RCU_FANOUT_LEAF defaults to 16, which means that a 64-bit kernel would
need more than 32 leaf rcu_node structures for the parent rcu_node
structure to need more than 32 bit to track its children. This means
that more than 32*16=512 CPUs are required for this bug to affect 64-bit
systems. And there really are systems this big, so I am surprised that
this has not shown up long ago. But it would not be the first time that
objective reality surprised me. ;-)
So with a 64-bit kernel, RCU_FANOUT_LEAF=16, if we have exactly 32 leaf
rcu_node structures (exactly 512 CPUs), we end up in the situation where
the type promotion from signed integer (32-bit) to unsigned long will
carry the sign, and thus create a mask of 0xffffffff80000000.
So if this weird mask is indeed an issue we should state that
configurations _starting from 512 CPUs_ are affected, not just those
with more than 512 CPUs.
Thanks,
Mathieu
Thanx, Paul
Thanks,
Mathieu
David
Thanks,
Mathieu
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Denis Arefev <arefev@xxxxxxxxx>
---
v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
kernel/rcu/srcutree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 20d7a238d675..6c18e6005ae1 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
snp->grplo = cpu;
snp->grphi = cpu;
}
- sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
+ sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo);
}
smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
return true;
@@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp
int cpu;
for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
- if (!(mask & (1 << (cpu - snp->grplo))))
+ if (!(mask & (1UL << (cpu - snp->grplo))))
continue;
srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay);
}
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com