On 30 January 2018 at 12:27, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> On 29/01/18 21:45, Ard Biesheuvel wrote:
>> On 29 January 2018 at 17:45, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>>> One of the major improvement of SMCCC v1.1 is that it only clobbers
>>> the first 4 registers, both on 32 and 64bit. This means that it
>>> becomes very easy to provide an inline version of the SMC call
>>> primitive, and avoid performing a function call to stash the
>>> registers that would otherwise be clobbered by SMCCC v1.0.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx>
>>> ---
>>>  include/linux/arm-smccc.h | 157 ++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 157 insertions(+)
>>>
>>> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
>>> index dd44d8458c04..bc5843728909 100644
>>> --- a/include/linux/arm-smccc.h
>>> +++ b/include/linux/arm-smccc.h
>>> @@ -150,5 +150,162 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
>>>
>>>  #define arm_smccc_hvc_quirk(...) __arm_smccc_hvc(__VA_ARGS__)
>>>
>>> +/* SMCCC v1.1 implementation madness follows */
>>> +#ifdef CONFIG_ARM64
>>> +
>>> +#define SMCCC_SMC_INST	"smc	#0"
>>> +#define SMCCC_HVC_INST	"hvc	#0"
>>> +
>>> +#define __arm_smccc_1_1_prologue(inst)				\
>>> +	inst "\n"						\
>>> +	"cbz	%[ptr], 1f\n"					\
>>> +	"stp	%x[r0], %x[r1], %[ra0]\n"			\
>>> +	"stp	%x[r2], %x[r3], %[ra2]\n"			\
>>> +	"1:\n"							\
>>> +	: [ra0] "=Ump" (*(&___res->a0)),			\
>>> +	  [ra2] "=Ump" (*(&___res->a2)),
>>> +
>>> +#define __arm_smccc_1_1_epilogue : "memory"
>>> +
>>> +#endif
>>> +
>>> +#ifdef CONFIG_ARM
>>> +#include <asm/opcodes-sec.h>
>>> +#include <asm/opcodes-virt.h>
>>> +
>>> +#define SMCCC_SMC_INST	__SMC(0)
>>> +#define SMCCC_HVC_INST	__HVC(0)
>>> +
>>> +#define __arm_smccc_1_1_prologue(inst)				\
>>> +	inst "\n"						\
>>> +	"cmp	%[ptr], #0\n"					\
>>> +	"stmne	%[ptr], {%[r0], %[r1], %[r2], %[r3]}\n"		\
>>> +	: "=m" (*___res),
>>> +
>>> +#define __arm_smccc_1_1_epilogue : "memory", "cc"
>>> +
>>> +#endif
>>> +
>>> +#define __constraint_write_0					\
>>> +	[r0] "+r" (r0), [r1] "=r" (r1), [r2] "=r" (r2), [r3] "=r" (r3)
>>> +#define __constraint_write_1					\
>>> +	[r0] "+r" (r0), [r1] "+r" (r1), [r2] "=r" (r2), [r3] "=r" (r3)
>>> +#define __constraint_write_2					\
>>> +	[r0] "+r" (r0), [r1] "+r" (r1), [r2] "+r" (r2), [r3] "=r" (r3)
>>> +#define __constraint_write_3					\
>>> +	[r0] "+r" (r0), [r1] "+r" (r1), [r2] "+r" (r2), [r3] "+r" (r3)
>>
>> It seems you need +r for all arguments, otherwise the compiler will
>> notice that the value is never used, and may assign the register to
>> 'res' instead, i.e.,
>>
>>  3e4:	320107e0	mov	w0, #0x80000001		// #-2147483647
>>  3e8:	320183e1	mov	w1, #0x80008000		// #-2147450880
>>  3ec:	910123a2	add	x2, x29, #0x48
>>  3f0:	d4000002	hvc	#0x0
>>  3f4:	b4000062	cbz	x2, 400 <enable_psci_bp_hardening+0x88>
>>  3f8:	a90487a0	stp	x0, x1, [x29, #72]
>>  3fc:	a9058fa2	stp	x2, x3, [x29, #88]
>>
>> (for the code generated in the next patch)
>
> Well spotted.
>
> I think this is because of the lack of early-clobber for the unassigned
> registers. The compiler assumes the whole sequence is a single
> instruction, with the output registers being affected at the end. If we
> mark those with '=&r', we will prevent GCC from emitting this kind of
> horror.
>

Tried that, actually, but it still doesn't help. It simply notices that
r2 (in this case) is not referenced after the asm () [nor before] and
so it simply seems to forget all about it, and reassigns it to
something else.
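
As a rough illustration of the register-reuse problem being described (a sketch only, arm64-flavoured; the helper name and the bare hvc are invented and this is not the patch itself): with x1-x3 declared as pure "=r" outputs and nothing reading them between the asm and its internal stores, GCC treats them as free and may hand one of them to an unrelated value such as the result pointer, which the trap then clobbers. Marking all four operands "+r" keeps them live across the instruction:

	/* Hypothetical helper for illustration only, arm64 flavour. */
	static unsigned long example_smccc_hvc(unsigned long fn_id)
	{
		register unsigned long r0 asm("x0") = fn_id;
		register unsigned long r1 asm("x1") = 0;	/* dummy init so "+r" is legal */
		register unsigned long r2 asm("x2") = 0;
		register unsigned long r3 asm("x3") = 0;

		/*
		 * "+r" on all four ties x0-x3 up across the trap, so the
		 * compiler cannot reuse x2 for the result pointer as in
		 * the disassembly quoted above.
		 */
		asm volatile("hvc	#0\n"
			     : "+r" (r0), "+r" (r1), "+r" (r2), "+r" (r3)
			     : : "memory");

		return r0;
	}
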
> Note that with Robin's trick of moving the assignment back to C code,
> this is a bit moot as this is really a single instruction (smc/hvc), and
> there is no intermediate register evaluation.
>

Yeah, that does look much better.
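
For context, a sketch of what Robin's trick amounts to (editor's reading, not the final patch; it assumes r0-r3 are the local register variables bound to the argument registers and ___res is the result pointer, as elsewhere in the patch): the asm statement carries nothing but the conduit instruction, and the conditional store into the result structure moves back into C, so the compiler tracks r0-r3 as ordinary values after the trap and the intra-asm store sequence disappears:

	asm volatile(SMCCC_HVC_INST "\n"
		     : "+r" (r0), "+r" (r1), "+r" (r2), "+r" (r3)
		     : : "memory");

	/* The store is plain C now, so no %[ptr]/cbz dance inside the asm. */
	if (___res)
		*___res = (struct arm_smccc_res){ r0, r1, r2, r3 };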