On Thu, Feb 17, 2022 at 7:48 PM Jakob Koschel <jakobkoschel@xxxxxxxxx> wrote: > list_for_each_entry() selects either the correct value (pos) or a safe > value for the additional mispredicted iteration (NULL) for the list > iterator. > list_for_each_entry() calls select_nospec(), which performs > a branch-less select. > > On x86, this select is performed via a cmov. Otherwise, it's performed > via various shift/mask/etc. operations. > > Kasper Acknowledgements: Jakob Koschel, Brian Johannesmeyer, Kaveh > Razavi, Herbert Bos, Cristiano Giuffrida from the VUSec group at VU > Amsterdam. > > Co-developed-by: Brian Johannesmeyer <bjohannesmeyer@xxxxxxxxx> > Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@xxxxxxxxx> > Signed-off-by: Jakob Koschel <jakobkoschel@xxxxxxxxx> Yeah, I think this is the best way to do this without deeply intrusive changes to how lists are represented in memory. Some notes on the specific implementation: > arch/x86/include/asm/barrier.h | 12 ++++++++++++ > include/linux/list.h | 3 ++- > include/linux/nospec.h | 16 ++++++++++++++++ > 3 files changed, 30 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h > index 35389b2af88e..722797ad74e2 100644 > --- a/arch/x86/include/asm/barrier.h > +++ b/arch/x86/include/asm/barrier.h > @@ -48,6 +48,18 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, > /* Override the default implementation from linux/nospec.h. */ > #define array_index_mask_nospec array_index_mask_nospec > > +/* Override the default implementation from linux/nospec.h. */ > +#define select_nospec(cond, exptrue, expfalse) \ > +({ \ > + typeof(exptrue) _out = (exptrue); \ > + \ > + asm volatile("test %1, %1\n\t" \ This shouldn't need "volatile", because it is only necessary if _out is actually used. Using "volatile" here could prevent optimizing out useless code. OPTIMIZER_HIDE_VAR() also doesn't use "volatile". > + "cmove %2, %0" \ > + : "+r" (_out) \ > + : "r" (cond), "r" (expfalse)); \ > + _out; \ > +}) I guess the idea is probably to also add code like this for other important architectures, in particular arm64? It might also be a good idea to rename the arch-overridable macro to something like "arch_select_nospec" and then have a wrapper macro in include/linux/nospec.h that takes care of type safety issues. The current definition of the macro doesn't warn if you pass in incompatible pointer types, like this: int *bogus_pointer_mix(int cond, int *a, long *b) { return select_nospec(cond, a, b); } and if you pass in integers of different sizes, it may silently truncate to the size of the smaller one - this C code: long wrong_int_conversion(int cond, int a, long b) { return select_nospec(cond, a, b); } generates this assembly: wrong_int_conversion: test %edi, %edi cmove %rdx, %esi movslq %esi, %rax ret It might be a good idea to add something like a static_assert(__same_type(...), ...) to protect against that. > /* Prevent speculative execution past this barrier. */ > #define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC) > > diff --git a/include/linux/list.h b/include/linux/list.h > index dd6c2041d09c..1a1b39fdd122 100644 > --- a/include/linux/list.h > +++ b/include/linux/list.h > @@ -636,7 +636,8 @@ static inline void list_splice_tail_init(struct list_head *list, > */ > #define list_for_each_entry(pos, head, member) \ > for (pos = list_first_entry(head, typeof(*pos), member); \ > - !list_entry_is_head(pos, head, member); \ > + ({ bool _cond = !list_entry_is_head(pos, head, member); \ > + pos = select_nospec(_cond, pos, NULL); _cond; }); \ > pos = list_next_entry(pos, member)) I wonder if it'd look nicer to write it roughly like this: #define NOSPEC_TYPE_CHECK(_guarded_var, _cond) \ ({ \ bool __cond = (_cond); \ typeof(_guarded_var) *__guarded_var = &(_guarded_var); \ *__guarded_var = select_nospec(__cond, *__guarded_var, NULL); \ __cond; \ }) #define list_for_each_entry(pos, head, member) \ for (pos = list_first_entry(head, typeof(*pos), member); \ NOSPEC_TYPE_CHECK(head, !list_entry_is_head(pos, head, member)); \ pos = list_next_entry(pos, member)) I think having a NOSPEC_TYPE_CHECK() like this makes it semantically clearer, and easier to add in other places? But I don't know if the others agree... > /** > diff --git a/include/linux/nospec.h b/include/linux/nospec.h > index c1e79f72cd89..ca8ed81e4f9e 100644 > --- a/include/linux/nospec.h > +++ b/include/linux/nospec.h > @@ -67,4 +67,20 @@ int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, > /* Speculation control for seccomp enforced mitigation */ > void arch_seccomp_spec_mitigate(struct task_struct *task); > > +/** > + * select_nospec - select a value without using a branch; equivalent to: > + * cond ? exptrue : expfalse; > + */ > +#ifndef select_nospec > +#define select_nospec(cond, exptrue, expfalse) \ > +({ \ > + unsigned long _t = (unsigned long) (exptrue); \ > + unsigned long _f = (unsigned long) (expfalse); \ > + unsigned long _c = (unsigned long) (cond); \ > + OPTIMIZER_HIDE_VAR(_c); \ > + unsigned long _m = -((_c | -_c) >> (BITS_PER_LONG - 1)); \ > + (typeof(exptrue)) ((_t & _m) | (_f & ~_m)); \ > +}) > +#endif (As a sidenote, it might be easier to implement a conditional zeroing primitive than a generic conditional select primitive if that's all you need, something like: #define cond_nullptr_nospec(_cond, _exp) \ ({ \ unsigned long __exp = (unsigned long)(_exp); \ unsigned long _mask = 0UL - !(_cond); \ OPTIMIZER_HIDE_VAR(_mask); \ (typeof(_exp)) (_mask & __exp); \ }) )