On Thu, Jul 14, 2022 at 01:36:22PM +0200, Paolo Bonzini wrote: > On 7/14/22 11:52, Peter Zijlstra wrote: > > On Wed, Jul 13, 2022 at 02:12:41PM -0300, Thadeu Lima de Souza Cascardo wrote: > > > The return thunk call makes the fastop functions larger, just like IBT > > > does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled. > > > > > > Otherwise, functions will be incorrectly aligned and when computing their > > > position for differently sized operators, they will executed in the middle > > > or end of a function, which may as well be an int3, leading to a crash > > > like: > > > > Bah.. I did the SETcc stuff, but then forgot about the FASTOP :/ > > > > af2e140f3420 ("x86/kvm: Fix SETcc emulation for return thunks") > > > > > Fixes: aa3d480315ba ("x86: Use return-thunk in asm code") > > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxxxxx> > > > Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> > > > Cc: Borislav Petkov <bp@xxxxxxx> > > > Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx> > > > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx> > > > Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx> > > > --- > > > arch/x86/kvm/emulate.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > > > index db96bf7d1122..d779eea1052e 100644 > > > --- a/arch/x86/kvm/emulate.c > > > +++ b/arch/x86/kvm/emulate.c > > > @@ -190,7 +190,7 @@ > > > #define X16(x...) X8(x), X8(x) > > > #define NR_FASTOP (ilog2(sizeof(ulong)) + 1) > > > -#define FASTOP_SIZE (8 * (1 + HAS_KERNEL_IBT)) > > > +#define FASTOP_SIZE (8 * (1 + (HAS_KERNEL_IBT | IS_ENABLED(CONFIG_RETHUNK)))) > > > > Would it make sense to do something like this instead? > > Yes, definitely. Applied with a small tweak to make FASTOP_LENGTH > more similar to SETCC_LENGTH: > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > index db96bf7d1122..0a15b0fec6d9 100644 > --- a/arch/x86/kvm/emulate.c > +++ b/arch/x86/kvm/emulate.c > @@ -189,8 +189,12 @@ > #define X8(x...) X4(x), X4(x) > #define X16(x...) X8(x), X8(x) > -#define NR_FASTOP (ilog2(sizeof(ulong)) + 1) > -#define FASTOP_SIZE (8 * (1 + HAS_KERNEL_IBT)) > +#define NR_FASTOP (ilog2(sizeof(ulong)) + 1) > +#define RET_LENGTH (1 + (4 * IS_ENABLED(CONFIG_RETHUNK)) + \ > + IS_ENABLED(CONFIG_SLS)) > +#define FASTOP_LENGTH (ENDBR_INSN_SIZE + 7 + RET_LENGTH) > +#define FASTOP_SIZE (8 << ((FASTOP_LENGTH > 8) & 1) << ((FASTOP_LENGTH > 16) & 1)) > +static_assert(FASTOP_LENGTH <= FASTOP_SIZE); > struct opcode { > u64 flags; > @@ -442,8 +446,6 @@ static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop); > * RET | JMP __x86_return_thunk [1,5 bytes; CONFIG_RETHUNK] > * INT3 [1 byte; CONFIG_SLS] > */ > -#define RET_LENGTH (1 + (4 * IS_ENABLED(CONFIG_RETHUNK)) + \ > - IS_ENABLED(CONFIG_SLS)) > #define SETCC_LENGTH (ENDBR_INSN_SIZE + 3 + RET_LENGTH) > #define SETCC_ALIGN (4 << ((SETCC_LENGTH > 4) & 1) << ((SETCC_LENGTH > 8) & 1)) > static_assert(SETCC_LENGTH <= SETCC_ALIGN); > > > Paolo > Any hint as to _where_ this was applied to? I'm trying to keep in sync with what is going to Linus "soon" for issues that are affecting the stable trees here. Shouldn't this go through the x86-urgent tree with the other retbleed fixes? thanks, greg k-h