On Wed, Aug 30, 2023 at 11:44:31AM +0100, Phil Elwell wrote: > Hi all, > > On Wed, 30 Aug 2023 at 11:38, Thorsten Leemhuis > <regressions@xxxxxxxxxxxxx> wrote: > > > > /me gets the impression he has to chime in here > > > > On 25.08.23 14:14, Stefan Wahren wrote: > > > Am 25.08.23 um 13:26 schrieb Jason A. Donenfeld: > > >> On Fri, Aug 25, 2023 at 01:14:55PM +0200, Stefan Wahren wrote: > > > > > >>> i didn't find the time to fix the performance regression in bcm2835-rng > > >>> which affects Raspberry Pi 0 - 3, so report it at least. AFAIK the first > > >>> report about this issue was here [1] and identified the offending > > >>> commit: > > >>> > > >>> 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of > > >>> cpu_relax()") > > >>> > > >>> #regzbot introduced: 96cb9d055445 > > >>> > > >>> I was able to reproduce this issue with a Raspberry Pi 3 B+ on Linux > > >>> 6.5-rc6 (arm64/defconfig). > > >>> > > >>> Before: > > >>> time sudo dd if=/dev/hwrng of=/dev/urandom count=1 bs=4096 status=none > > >>> > > >>> real 3m29,002s > > >>> user 0m0,018s > > >>> sys 0m0,054s > > >> That's not surprising. But also, does it matter? That script has > > >> *always* been wrong. Writing to /dev/urandom like that has *never* > > >> ensured that those bytes are taken into account immediately after. It's > > >> just not how that interface works. So any assumptions based on that are > > >> bogus, and that line effectively does nothing. > > >> > > >> Fortunately, however, the kernel itself incorporates hwrng output into > > >> the rng pool, so you don't need to think about doing it yourself. > > >> > > >> So go ahead and remove that line from your script. > > > > > > Thanks for your explanation. Unfortunately this isn't my script. > > > > And I assume it's in the standard install of the RpiOS or similarly > > widespread? > > > > > I'm > > > just a former BCM2835 maintainer and interested that more user stick to > > > the mainline kernel instead of the vendor ones. I will try to report the > > > script owner. > > > > thx > > > > >> Now as far as the "regression" goes, we've made an already broken > > >> userspace script take 3 minutes longer than usual, but it still does > > >> eventually complete, so it's not making boot impossible or something. > > >> How this relates to the "don't break userspace" rule might be a matter > > >> of opinion. > > > > Yup, but I'd say it bad enough to qualify as regression. If it would be > > something like 10 seconds it might be something different, but 3 minutes > > will look like a hang to many people, and I'm pretty sure that's > > something Linus doesn't want to see. But let's not involve him for now > > and first try to solve this differently. > > > > >> If you think it does, maybe send a patch to Herbert reducing > > >> that sleep from 1000 to 100 and stating why with my background above, > > >> and see if he agrees it's worth fixing. > > > > Stefan, did you try to see how long it would take when the sleep time is > > reduced? I guess that might be our best chance to solve this, as > > reverting the culprit afaics would lead to regressions for others. > > > > /me wonders if the sleep time could even be reduced futher that 100 > > FYI, this is how I tackled it downstream: > > https://github.com/raspberrypi/linux/commit/6a825ed68f75bd768e31110ba825b75c5c09cf2d > > I can send a patch if it looks appropriate for upstream use. > > Phil Upstream discussion: https://lore.kernel.org/all/ZOoe0lOR9zpcAw5I@xxxxxxxxx/