On Sun, Oct 15, 2023 at 6:47 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote: > > Hi, > > 在 2023/10/14 5:51, Khazhy Kumykov 写道: > > Looking at the generic mul_u64_u64_div_u64 impl, it doesn't handle > > overflow of the final result either, as far as I can tell. So while on > > x86 we get a DE, on non-x86 we just get the wrong result. > > > > (Aside: after 8d6bbaada2e0 ("blk-throttle: prevent overflow while > > calculating wait time"), setting a very-high bps_limit would probably > > also cause this crash, no?) > > > > Would it be possible to have a "check_mul_u64_u64_div_u64_overflow()", > > where if the result doesn't fit in u64, we indicate (and let the > > caller choose what to do? Here we should just return U64_MAX)? > > > > Absent that, maybe we can take inspiration from the generic > > mul_u64_u64_div_u64? (Forgive the paste) > > > > static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed) > > { > > + /* Final result probably won't fit in u64 */ > > + if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62) > > I'm not sure, but this condition looks necessary, but doesn't look > sufficient, for example, jiffy_elapsed cound be greater than HZ, while > ilog2(jiffy_elapsed) is equal to ilog2(HZ). I believe 62 is correct, although admittedly it's less "intuitive" than the check in mul_u64_u64_div_u64().... The result overflows if log2(A * B / C) >= 64, so we want to ensure that: log2(A) + log2(B) - log2(C) < 64 Given that: ilog2(A) <= log2(A) < ilog2(A) + 1 // truncation defn It follows that: -log2(A) <= -ilog2(A) // Inverse rule log2(A) - 1 < ilog2(A) Starting from: ilog2(A) + ilog2(B) - ilog2(C) <= X We can show: (log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < ilog2(A) + ilog2(B) + (-ilog2(C)) // strict inequality here since the substitutions for A and B terms are strictly less (log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < X log2(A) + log2(B) - log2(C) < X + 2 So for X = 62, log2(A) + log2(B) - log2(C) < 64 must be true, and we must be safe from overflow. So... by converse, if ilog2(A) + ilog2(B) - ilog2(C) > 62, we cannot guarantee that the result will not overflow - thus we bail out. // end math It /is/ less exact than your proposal (sufficient, but not necessary), but saves an extra 128bit mul. I mostly just want us to pick /something/, since 6.6-rc and the LTSs with the patch in question are busted currently. :) > > Thanks, > Kuai > > > + return U64_MAX; > > return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ); > > } > > > > . > > >