> On 18. Feb 2022, at 17:29, Jann Horn <jannh@xxxxxxxxxx> wrote: > > On Thu, Feb 17, 2022 at 7:48 PM Jakob Koschel <jakobkoschel@xxxxxxxxx> wrote: >> list_for_each_entry() selects either the correct value (pos) or a safe >> value for the additional mispredicted iteration (NULL) for the list >> iterator. >> list_for_each_entry() calls select_nospec(), which performs >> a branch-less select. >> >> On x86, this select is performed via a cmov. Otherwise, it's performed >> via various shift/mask/etc. operations. >> >> Kasper Acknowledgements: Jakob Koschel, Brian Johannesmeyer, Kaveh >> Razavi, Herbert Bos, Cristiano Giuffrida from the VUSec group at VU >> Amsterdam. >> >> Co-developed-by: Brian Johannesmeyer <bjohannesmeyer@xxxxxxxxx> >> Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@xxxxxxxxx> >> Signed-off-by: Jakob Koschel <jakobkoschel@xxxxxxxxx> > > Yeah, I think this is the best way to do this without deeply intrusive > changes to how lists are represented in memory. > > Some notes on the specific implementation: > >> arch/x86/include/asm/barrier.h | 12 ++++++++++++ >> include/linux/list.h | 3 ++- >> include/linux/nospec.h | 16 ++++++++++++++++ >> 3 files changed, 30 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h >> index 35389b2af88e..722797ad74e2 100644 >> --- a/arch/x86/include/asm/barrier.h >> +++ b/arch/x86/include/asm/barrier.h >> @@ -48,6 +48,18 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, >> /* Override the default implementation from linux/nospec.h. */ >> #define array_index_mask_nospec array_index_mask_nospec >> >> +/* Override the default implementation from linux/nospec.h. */ >> +#define select_nospec(cond, exptrue, expfalse) \ >> +({ \ >> + typeof(exptrue) _out = (exptrue); \ >> + \ >> + asm volatile("test %1, %1\n\t" \ > > This shouldn't need "volatile", because it is only necessary if _out > is actually used. Using "volatile" here could prevent optimizing out > useless code. OPTIMIZER_HIDE_VAR() also doesn't use "volatile". > >> + "cmove %2, %0" \ >> + : "+r" (_out) \ >> + : "r" (cond), "r" (expfalse)); \ >> + _out; \ >> +}) > > I guess the idea is probably to also add code like this for other > important architectures, in particular arm64? yes indeed, with a fallback of using the shifting/masking mechanism for other archs. > > > It might also be a good idea to rename the arch-overridable macro to > something like "arch_select_nospec" and then have a wrapper macro in > include/linux/nospec.h that takes care of type safety issues. > > The current definition of the macro doesn't warn if you pass in > incompatible pointer types, like this: > > int *bogus_pointer_mix(int cond, int *a, long *b) { > return select_nospec(cond, a, b); > } > > and if you pass in integers of different sizes, it may silently > truncate to the size of the smaller one - this C code: > > long wrong_int_conversion(int cond, int a, long b) { > return select_nospec(cond, a, b); > } > > generates this assembly: > > wrong_int_conversion: > test %edi, %edi > cmove %rdx, %esi > movslq %esi, %rax > ret > > It might be a good idea to add something like a > static_assert(__same_type(...), ...) to protect against that. These are good points, thank you for your input. Will be good to incorporate. > >> /* Prevent speculative execution past this barrier. */ >> #define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC) >> >> diff --git a/include/linux/list.h b/include/linux/list.h >> index dd6c2041d09c..1a1b39fdd122 100644 >> --- a/include/linux/list.h >> +++ b/include/linux/list.h >> @@ -636,7 +636,8 @@ static inline void list_splice_tail_init(struct list_head *list, >> */ >> #define list_for_each_entry(pos, head, member) \ >> for (pos = list_first_entry(head, typeof(*pos), member); \ >> - !list_entry_is_head(pos, head, member); \ >> + ({ bool _cond = !list_entry_is_head(pos, head, member); \ >> + pos = select_nospec(_cond, pos, NULL); _cond; }); \ >> pos = list_next_entry(pos, member)) > > I wonder if it'd look nicer to write it roughly like this: > > #define NOSPEC_TYPE_CHECK(_guarded_var, _cond) \ > ({ \ > bool __cond = (_cond); \ > typeof(_guarded_var) *__guarded_var = &(_guarded_var); \ > *__guarded_var = select_nospec(__cond, *__guarded_var, NULL); \ > __cond; \ > }) > > #define list_for_each_entry(pos, head, member) \ > for (pos = list_first_entry(head, typeof(*pos), member); \ > NOSPEC_TYPE_CHECK(head, !list_entry_is_head(pos, head, member)); \ > pos = list_next_entry(pos, member)) > > I think having a NOSPEC_TYPE_CHECK() like this makes it semantically > clearer, and easier to add in other places? But I don't know if the > others agree... That sounds like a good idea. I wonder if the pointer and dereference in NOSPEC_TYPE_CHECK() simply get optimized away. Or why you can't simply use _guarded_var directly instead of a pointer to it. > >> /** >> diff --git a/include/linux/nospec.h b/include/linux/nospec.h >> index c1e79f72cd89..ca8ed81e4f9e 100644 >> --- a/include/linux/nospec.h >> +++ b/include/linux/nospec.h >> @@ -67,4 +67,20 @@ int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, >> /* Speculation control for seccomp enforced mitigation */ >> void arch_seccomp_spec_mitigate(struct task_struct *task); >> >> +/** >> + * select_nospec - select a value without using a branch; equivalent to: >> + * cond ? exptrue : expfalse; >> + */ >> +#ifndef select_nospec >> +#define select_nospec(cond, exptrue, expfalse) \ >> +({ \ >> + unsigned long _t = (unsigned long) (exptrue); \ >> + unsigned long _f = (unsigned long) (expfalse); \ >> + unsigned long _c = (unsigned long) (cond); \ >> + OPTIMIZER_HIDE_VAR(_c); \ >> + unsigned long _m = -((_c | -_c) >> (BITS_PER_LONG - 1)); \ >> + (typeof(exptrue)) ((_t & _m) | (_f & ~_m)); \ >> +}) >> +#endif > > (As a sidenote, it might be easier to implement a conditional zeroing > primitive than a generic conditional select primitive if that's all > you need, something like: > > #define cond_nullptr_nospec(_cond, _exp) \ > ({ \ > unsigned long __exp = (unsigned long)(_exp); \ > unsigned long _mask = 0UL - !(_cond); \ > OPTIMIZER_HIDE_VAR(_mask); \ > (typeof(_exp)) (_mask & __exp); \ > }) > > ) Ah yes, if NULL is actually the value to choose, this might be good enough.