Hi Toke,

On Tue, Dec 6, 2022 at 1:50 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> "Jason A. Donenfeld" <Jason@xxxxxxxxx> writes:
>
> > On Mon, Dec 05, 2022 at 11:21:51PM +0100, Daniel Borkmann wrote:
> >> On 12/5/22 7:15 PM, Jason A. Donenfeld wrote:
> >> > Since BPF's bpf_user_rnd_u32() was introduced, there have been three
> >> > significant developments in the RNG: 1) get_random_u32() returns the
> >> > same types of bytes as /dev/urandom, eliminating the distinction between
> >> > "kernel random bytes" and "userspace random bytes", 2) get_random_u32()
> >> > operates mostly locklessly over percpu state, 3) get_random_u32() has
> >> > become quite fast.
> >>
> >> Wrt "quite fast", do you have a comparison between the two? Asking as it's
> >> often used in networking worst case on a per-packet basis (e.g. via XDP),
> >> so it would be useful to state concrete numbers for the two on a given
> >> machine.
> >
> > Median of 25 cycles vs median of 38, on my Tiger Lake machine. So a
> > little slower, but too small of a difference to matter.
>
> Assuming a 3 GHz CPU clock (so 3 cycles per nanosecond), that's an
> additional overhead of ~4.3 ns. When processing 10 Gbps at line rate
> with small packets, the per-packet processing budget is 67.2 ns, so
> those extra 4.3 ns will eat up ~6.4% of the budget.
>
> So in other words, "too small a difference to matter" is definitely not
> true in general. It really depends on the use case; if someone is using
> this to, say, draw per-packet random numbers to compute a drop frequency
> on ingress, that extra processing time will most likely result in a
> quite measurable drop in performance.

Huh, neat calculation, I'll keep that method in mind.

Alright, sorry for the noise here. I'll check back in if I ever manage to
eliminate that performance gap.

Jason
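For anyone wanting to redo the budget arithmetic above, here is a minimal sketch. It assumes a 3 GHz clock and minimum-size Ethernet frames (64-byte frame plus 8-byte preamble and 12-byte interframe gap, i.e. 84 bytes on the wire) at 10 Gbps; the cycle counts are the medians quoted in the thread.

```python
# Per-packet budget arithmetic from the thread (assumptions: 3 GHz clock,
# minimum-size frames on a 10 Gbps link).
CYCLES_PER_NS = 3.0                       # 3 GHz => 3 cycles per nanosecond

extra_cycles = 38 - 25                    # get_random_u32() vs bpf_user_rnd_u32() medians
extra_ns = extra_cycles / CYCLES_PER_NS   # ~4.3 ns of added overhead

# Minimum on-wire frame: 64-byte frame + 8-byte preamble + 12-byte IFG = 84 bytes
wire_bits = 84 * 8
budget_ns = wire_bits / 10e9 * 1e9        # 67.2 ns per packet at 10 Gbps line rate

print(f"extra {extra_ns:.1f} ns of a {budget_ns:.1f} ns budget "
      f"= {extra_ns / budget_ns:.1%}")    # -> extra 4.3 ns of a 67.2 ns budget = 6.4%
```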