On Wed, Aug 30, 2023 at 08:05:39PM +0200, Stefan Wahren wrote: > Hi Jason, > > Am 26.08.23 um 17:48 schrieb Jason A. Donenfeld: > > On Sat, Aug 26, 2023 at 04:01:58PM +0200, Stefan Wahren wrote: > >> Hi Jason, > >> > >> Am 26.08.23 um 14:34 schrieb Jason A. Donenfeld: > >>> On Sat, Aug 26, 2023 at 01:28:28PM +0200, Stefan Wahren wrote: > >>>> The recent RCU stall fix caused a massive throughput regression of the > >>>> hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput > >>>> as before the RCU stall fix. > >>>> > >>>> Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig): > >>>> > >>>> sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000 > >>>> > >>>> cpu_relax ~138025 Bytes / sec > >>>> hwrng_msleep(1000) ~13 Bytes / sec > >>>> usleep_range(100,200) ~92141 Bytes / sec > >>>> > >>>> Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()") > >>>> Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@xxxxxxx/ > >>>> Signed-off-by: Stefan Wahren <wahrenst@xxxxxxx> > >>>> --- > >>>> drivers/char/hw_random/bcm2835-rng.c | 3 ++- > >>>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>>> > >>>> diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c > >>>> index e98fcac578d6..3f1b6aaa98ee 100644 > >>>> --- a/drivers/char/hw_random/bcm2835-rng.c > >>>> +++ b/drivers/char/hw_random/bcm2835-rng.c > >>>> @@ -14,6 +14,7 @@ > >>>> #include <linux/printk.h> > >>>> #include <linux/clk.h> > >>>> #include <linux/reset.h> > >>>> +#include <linux/delay.h> > >>>> > >>>> #define RNG_CTRL 0x0 > >>>> #define RNG_STATUS 0x4 > >>>> @@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max, > >>>> while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) { > >>>> if (!wait) > >>>> return 0; > >>>> - hwrng_msleep(rng, 1000); > >>>> + usleep_range(100, 200); > >>> I think we still need to use the hwrng_msleep function so that the sleep > >>> remains cancelable. Maybe just change the 1000 to 100? > >> i found that other hwrng driver like iproc-rng200 (Raspberry Pi 4) also > >> use usleep_range(). > >> > >> Nevertheless here are more numbers: > >> > >> usleep_range(200,400) : 47776 bytes / sec > >> hwrng_msleep(20) : 715 bytes / sec > >> > >> Changing to 100 ms won't be a real gain. > > I'm fine with whatever number you want there. Maybe we need a > > hwrng_usleep_range() that takes into account rng->dying like > > hwrng_msleep() does? (And iproc-rng200 should probably use that too?) > the idea of this patch was to fix the performance regression in upcoming > mainline and backport the fix to Linux 6.1 LTS. After that i'm fine with > the introduction of hwrng_usleep_range(). No, you've got it backwards, but maybe it's a bit confusing why that is. Originally, lots of drivers called variants of the msleep/usleep functions. But this lead to some subtle bugs when the hwrng subsystem was changing the active RNG or when the driver was unloading or during suspend or a couple of other edge cases. In some cases, I believe there were some interesting deadlocks. The solution to it was to make those sleep calls cancellable with the &rng->dying completion, so those sleeps could break, regardless of the time or the state of timekeeping on the system. The &rng->dying thing was encapsulated in the hwrng_msleep() function, and now it's a rule that any sleeps in hwrng drivers have got to go through that. Now, it sounds like you've come up with a case where hwrng_msleep(1) is too long of a sleep for your purposes. In that case, indeed a usleep variant might be called for. But in doing so, don't _reintroduce the same bug we already fixed_. In other words, don't fix one regression by adding another. Instead, follow the established pattern, and add a hwrng_usleep_whatever() that waits on that same &rng->dying. Jason