On Sat, Aug 26, 2023 at 04:01:58PM +0200, Stefan Wahren wrote: > Hi Jason, > > Am 26.08.23 um 14:34 schrieb Jason A. Donenfeld: > > On Sat, Aug 26, 2023 at 01:28:28PM +0200, Stefan Wahren wrote: > >> The recent RCU stall fix caused a massive throughput regression of the > >> hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput > >> as before the RCU stall fix. > >> > >> Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig): > >> > >> sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000 > >> > >> cpu_relax ~138025 Bytes / sec > >> hwrng_msleep(1000) ~13 Bytes / sec > >> usleep_range(100,200) ~92141 Bytes / sec > >> > >> Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()") > >> Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@xxxxxxx/ > >> Signed-off-by: Stefan Wahren <wahrenst@xxxxxxx> > >> --- > >> drivers/char/hw_random/bcm2835-rng.c | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c > >> index e98fcac578d6..3f1b6aaa98ee 100644 > >> --- a/drivers/char/hw_random/bcm2835-rng.c > >> +++ b/drivers/char/hw_random/bcm2835-rng.c > >> @@ -14,6 +14,7 @@ > >> #include <linux/printk.h> > >> #include <linux/clk.h> > >> #include <linux/reset.h> > >> +#include <linux/delay.h> > >> > >> #define RNG_CTRL 0x0 > >> #define RNG_STATUS 0x4 > >> @@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max, > >> while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) { > >> if (!wait) > >> return 0; > >> - hwrng_msleep(rng, 1000); > >> + usleep_range(100, 200); > > > > I think we still need to use the hwrng_msleep function so that the sleep > > remains cancelable. Maybe just change the 1000 to 100? > i found that other hwrng driver like iproc-rng200 (Raspberry Pi 4) also > use usleep_range(). > > Nevertheless here are more numbers: > > usleep_range(200,400) : 47776 bytes / sec > hwrng_msleep(20) : 715 bytes / sec > > Changing to 100 ms won't be a real gain. I'm fine with whatever number you want there. Maybe we need a hwrng_usleep_range() that takes into account rng->dying like hwrng_msleep() does? (And iproc-rng200 should probably use that too?) Jason