On Thu, Mar 26, 2020 at 09:31:32AM -0700, Kees Cook wrote:
> On Thu, Mar 26, 2020 at 11:15:21AM +0000, Mark Rutland wrote:
> > On Wed, Mar 25, 2020 at 01:22:07PM -0700, Kees Cook wrote:
> > > On Wed, Mar 25, 2020 at 01:21:27PM +0000, Mark Rutland wrote:
> > > > On Tue, Mar 24, 2020 at 01:32:31PM -0700, Kees Cook wrote:
> > > > > Allow for a randomized stack offset on a per-syscall basis, with roughly
> > > > > 5 bits of entropy.
> > > > >
> > > > > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > > >
> > > > Just to check, do you have an idea of the impact on arm64? Patch 3 had
> > > > figures for x86 where it reads the TSC, and it's unclear to me how
> > > > get_random_int() compares to that.
> > >
> > > I didn't do a measurement on arm64 since I don't have a good bare-metal
> > > test environment. I know Andy Lutomirski has plans for making
> > > get_random_int() as fast as possible, so that's why I used it here.
> >
> > Ok. I suspect I also won't get the chance to test that in the next few
> > days, but if I do I'll try to share the results.
>
> Okay, thanks! I can try a rough estimate under emulation, but I assume
> that'll be mostly useless. :)
>
> > My concern here was that, get_random_int() has to grab a spinlock and
> > mess with IRQ masking, so has the potential to block for much longer,
> > but that might not be an issue in practice, and I don't think that
> > should block these patches.
>
> Gotcha. I was already surprised by how "heavy" the per-cpu access was
> when I looked at the resulting assembly (there looked to be preempt
> stuff, etc). But my hope was that this is configurable so people can
> measure for themselves if they want it, and most people who want this
> feature have a high tolerance for performance trade-offs. ;)
>
> > > I couldn't figure out if there was a comparable instruction like rdtsc
> > > in aarch64 (it seems there's a cycle counter, but I found nothing in
> > > the kernel that seemed to actually use it)?
> >
> > AArch64 doesn't have a direct equivalent. The generic counter
> > (CNTxCT_EL0) is the closest thing, but its nominal frequency is
> > typically much lower than the nominal CPU clock frequency (unlike TSC
> > where they're the same). The cycle counter (PMCCNTR_EL0) is part of the
> > PMU, and can't be relied on in the same way (e.g. as perf reprograms it
> > to generate overflow events, and it can stop for things like WFI/WFE).
>
> Okay, cool; thanks for the details! It's always nice to confirm I didn't
> miss some glaringly obvious solution. ;)
>
> For a potential v2, should I add your reviewed-by or wait for your
> timing analysis, etc?

I'd rather not give an R-b until I've seen numbers, but please don't
block waiting for that. For the moment, feel free to add:

Acked-by: Mark Rutland <mark.rutland@xxxxxxx>

... and it's down to Will and Catalin to make the call for arm64.

Thanks,
Mark.
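
For readers following the thread, the "roughly 5 bits of entropy" figure
in the quoted patch description can be illustrated with a minimal
userspace sketch. This is not the code from the patch. It assumes that 9
raw random bits are kept and that the stack pointer is rounded down to a
16-byte boundary; the names kstack_offset(), kstack_distinct_offsets(),
KSTACK_RAND_MASK and STACK_ALIGN are hypothetical and exist only for
this example.

```c
#include <stdint.h>

/* Hypothetical constants for illustration only; not taken from the patch. */
#define KSTACK_RAND_MASK 0x1FFu  /* assumption: 9 raw random bits are kept */
#define STACK_ALIGN      16u     /* assumption: 16-byte stack alignment */

/*
 * Turn a raw random value into a stack offset in bytes.
 * Masking keeps 9 bits; rounding down to the alignment discards the
 * low 4 bits, leaving 2^5 = 32 distinct offsets (5 bits of entropy).
 */
static inline uint32_t kstack_offset(uint32_t raw)
{
	return (raw & KSTACK_RAND_MASK) & ~(STACK_ALIGN - 1u);
}

/* Number of distinct offsets the scheme above can produce. */
static inline unsigned int kstack_distinct_offsets(void)
{
	return (KSTACK_RAND_MASK + 1u) / STACK_ALIGN;
}
```

Under these assumptions the entropy is independent of where the raw
value comes from (get_random_int(), a counter read, or anything else);
the source of randomness only changes the cost and the predictability of
the input, which is the trade-off discussed above.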