> On Thu, Jan 04, 2018 at 02:15:53AM +0000, Alan Cox wrote:
> > > >
> > > > Elena has done the work of auditing static analysis reports to a dozen
> > > > or so locations that need some 'nospec' handling.
> > >
> > > How exactly is that related (especially in longer-term support terms) to
> > > BPF anyway?
> >
> > If you read the papers you need a very specific construct in order to not
> > only cause a speculative load of an address you choose but also to then
> > manage to cause a second operation that in some way reveals bits of data
> > or allows you to ask questions.
> >
> > BPF allows you to construct those sequences relatively easily and it's
> > the one case where a user space application can fairly easily place code
> > it wants to execute in the kernel. Without BPF you have to find the right
> > construct in the kernel, prime all the right predictions and measure the
> > result without getting killed off. There are places you can do that but
> > they are not so easy and we don't (at this point) think there are that
> > many.
>
> for BPF in particular we're thinking to do a different fix.
> Instead of killing speculation we can let cpu speculate.
> The fix will include rounding up bpf maps to nearest power of 2 and
> inserting bpf_and operation on the index after bounds check,
> so cpu can continue speculate beyond bounds check, but it will
> load from zero-ed memory.
> So this nospec arch dependent hack won't be necessary for bpf side,
> but may still be needed in other parts of the kernel.

I think this is a much better approach than what we have been doing
internally so far. My initial fix back in August was to insert a minimal
set of lfence barriers (the osb() barrier below) that prevent speculation,
placed based on how the attack was using constants to index eBPF maps:

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c

 #define BPF_R0	regs[BPF_REG_0]
@@ -939,6 +940,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
 		DST = IMM;
 		CONT;
 	LD_IMM_DW:
+		osb();
 		DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
 		insn++;
 		CONT;
@@ -1200,6 +1202,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
 		*(SIZE *)(unsigned long) (DST + insn->off) = IMM;	\
 		CONT;							\
 	LDX_MEM_##SIZEOP:						\
+		osb();							\
 		DST = *(SIZE *)(unsigned long) (SRC + insn->off);	\
 		CONT;

And similar changes also for the x86 JIT (blinding dependent):

@@ -400,7 +413,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			case BPF_ADD: b2 = 0x01; break;
 			case BPF_SUB: b2 = 0x29; break;
 			case BPF_AND: b2 = 0x21; break;
-			case BPF_OR: b2 = 0x09; break;
+			case BPF_OR: b2 = 0x09; emit_memory_barrier(&prog); break;
 			case BPF_XOR: b2 = 0x31; break;
 			}
 			if (BPF_CLASS(insn->code) == BPF_ALU64)
@@ -647,6 +660,16 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 		case BPF_ALU64 | BPF_RSH | BPF_X:
 		case BPF_ALU64 | BPF_ARSH | BPF_X:
+			/* If blinding is enabled, each
+			 * BPF_LD | BPF_IMM | BPF_DW instruction
+			 * is converted to 4 eBPF instructions with
+			 * BPF_ALU64_IMM(BPF_LSH, BPF_REG_AX, 32)
+			 * always present (number 3). Detect such cases
+			 * and insert memory barriers. */
+			if ((BPF_CLASS(insn->code) == BPF_ALU64)
+			    && (BPF_OP(insn->code) == BPF_LSH)
+			    && (src_reg == BPF_REG_AX))
+				emit_memory_barrier(&prog);
 			/* check for bad case when dst_reg == rcx */
 			if (dst_reg == BPF_REG_4) {
 				/* mov r11, dst_reg */

But this is a really ugly fix IMO just to prevent the speculation from
happening, so I think your approach is much better.
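Just to check that I understand the masking idea correctly, here is a
minimal user-space sketch of it (illustration only, not the actual bpf
patch; spec_array and its helpers are made-up names): the backing store is
rounded up to the nearest power of two and zero-initialized, and the index
is AND-ed with the size mask after the bounds check, so even a speculative
access past the check stays inside zeroed memory.

/* Illustration only: user-space sketch of "mask the index after the
 * bounds check". All names (spec_array, etc.) are made up.
 */
#include <stdlib.h>

struct spec_array {
	size_t nr_entries;	/* number of entries the user asked for */
	size_t index_mask;	/* power-of-2 backing size minus one */
	unsigned long *data;	/* zeroed, power-of-2 sized backing store */
};

static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

static struct spec_array *spec_array_alloc(size_t nr_entries)
{
	struct spec_array *a = malloc(sizeof(*a));
	size_t size;

	if (!a)
		return NULL;
	size = round_up_pow2(nr_entries);
	a->nr_entries = nr_entries;
	a->index_mask = size - 1;
	/* calloc() zeroes the padding, so an index that is speculatively
	 * out of bounds (but inside the mask) only ever reads zeroes. */
	a->data = calloc(size, sizeof(*a->data));
	if (!a->data) {
		free(a);
		return NULL;
	}
	return a;
}

static unsigned long spec_array_lookup(struct spec_array *a, size_t idx)
{
	if (idx >= a->nr_entries)
		return 0;
	/* The AND keeps even a speculatively executed load inside the
	 * zero-padded allocation, so no barrier is needed here. */
	return a->data[idx & a->index_mask];
}

int main(void)
{
	struct spec_array *a = spec_array_alloc(100);
	unsigned long v;

	if (!a)
		return 1;
	a->data[7] = 42;
	/* In-bounds lookup returns the stored value; an out-of-bounds
	 * index is rejected architecturally and, thanks to the mask,
	 * cannot reach neighbouring memory speculatively either. */
	v = spec_array_lookup(a, 7);
	free(a->data);
	free(a);
	return v == 42 ? 0 : 1;
}

The power-of-2 rounding wastes a bit of memory on padding, but it lets a
single cheap AND on the lookup path replace a serializing barrier.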
If you need help in testing the fixes, I can do it unless you already have
the attack code yourself.

>
> Also note that variant 1 is talking about exploiting prog_array
> bpf feature which had 64-bit access prior to
> commit 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT")
> That was a fix for JIT and not related to this cpu issue, but
> I believe it breaks the existing exploit.

Actually I could not get the existing exploit working on anything past 4.11,
but at that time I could not see any fundamental change in eBPF that would
prevent it.

Best Regards,
Elena.