On Sat, Sep 23, 2017 at 1:41 AM, Jakub Kicinski via iovisor-dev <iovisor-dev@xxxxxxxxxxxxxxxxx> wrote:
> On Fri, 22 Sep 2017 22:03:47 -0700, Yonghong Song wrote:
>> On 9/22/17 9:24 AM, Jakub Kicinski wrote:
>> > On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:
>> >> On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
>> >>> On 18/09/2017 22:29, Daniel Borkmann wrote:
>> >>>> On 09/18/2017 10:47 PM, Jiong Wang wrote:
>> >>>>> Hi,
>> >>>>>
>> >>>>> Currently, the LLVM eBPF backend always generates code in 64-bit
>> >>>>> mode, which may cause trouble when JITing to 32-bit targets.
>> >>>>>
>> >>>>> For example, it is quite common for an XDP eBPF program to access
>> >>>>> some packet fields through base + offset, for which the default
>> >>>>> eBPF will generate BPF_ALU64 for the address formation. Later, when
>> >>>>> JITing to 32-bit hardware, BPF_ALU64 needs to be expanded into
>> >>>>> 32-bit ALU sequences even though the address space is 32-bit and
>> >>>>> the high bits are not significant.
>> >>>>>
>> >>>>> While a complete 32-bit mode implementation may need a new ABI
>> >>>>> (something like -target-abi=ilp32), this patch set first adds some
>> >>>>> initial code so we could construct 32-bit eBPF tests through
>> >>>>> hand-written assembly.
>> >>>>>
>> >>>>> A new 32-bit register set is introduced, its names carry a "w"
>> >>>>> prefix, and the LLVM assembler will encode statements like
>> >>>>> "w1 += w2" into the following 8-bit code field:
>> >>>>>
>> >>>>>   BPF_ADD | BPF_X | BPF_ALU
>> >>>>>
>> >>>>> BPF_ALU will be used instead of BPF_ALU64.
>> >>>>>
>> >>>>> NOTE, currently you can only use "w" registers with ALU
>> >>>>> statements, not with others like branches etc., as they don't have
>> >>>>> a different encoding for 32-bit targets.
>> >>>>
>> >>>> Great to see work in this direction!
>> >>>> Can we also enable the use / emission of all the 32bit BPF_ALU
>> >>>> instructions whenever possible for the currently available bpf
>> >>>> targets while at it (which only use BPF_ALU64 right now)?
>> >>>
>> >>> Hi Daniel,
>> >>>
>> >>> Thanks for the feedback.
>> >>>
>> >>> I think we could also enable the use of all the 32bit BPF_ALU under
>> >>> currently available bpf targets. As we now have 32bit register set
>> >>> support, we could make the i32 type a legal type to prevent it from
>> >>> being promoted into i64, then hook it up with i32 ALU patterns. Will
>> >>> look into this.
>> >>
>> >> I don't think we need to gate 32bit alu generation with a flag.
>> >> Though interpreter and JITs support 32-bit since day one, the verifier
>> >> has never seen such programs before, so some valid programs may get
>> >> rejected. After some time passes and we're sure that all progs
>> >> still work fine when they're optimized with 32-bit alu, we can flip
>> >> the switch in llvm and make it default.
>> >
>> > Thinking about next steps - do we expect the 32b operations to clear the
>> > upper halves of the registers? The interpreter does it, and so does
>> > x86. I don't think we can load 32bit-only programs on 64bit hosts, so
>> > we would need some form of data flow analysis in the kernel to prune
>> > the zeroing for 32bit offload targets. Is that correct?
>>
>> Could you contrive an example to show the problem? If I understand
>> correctly, you are most worried that some natural sign extension is gone
>> with "clearing the upper 32-bit register" and that such clearing may make
>> some operations, esp. memory operations, incorrect on a 64-bit machine?
>
> Hm. Perhaps it's a blunder on my side, but let's take:
>
>   r1 = ~0ULL
>   w1 = 0
>   # use r1
>
> on x86 and the interpreter, the w1 = 0 will clear upper 32bits, so r1
> ends up as 0.
> 32b arches may translate this to something like:
>
>   # r1 = ~0ULL
>   r1.lo = ~0
>   r1.hi = ~0
>   # w1 = 0
>   r1.lo = 0
>   # r1.hi not touched
>
> which will obviously result in r1 == 0xffffffff00000000. LLVM should
> not assume r1.hi is cleared, but I'm not sure this is a strong enough
> argument.

Not sure what LLVM will do in this case for a later "r1" access until we
go through the real implementation. My hunch is that LLVM should do a
conversion from 32bit to 64bit, i.e. "r1 <<= 32" and "r1 >>= 32", after
"w1 = 0" and before using r1.

Let us wait and check once the implementation is in place.
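
To make the two behaviours in the example above concrete, here is a small C
model. It is only a sketch, not kernel or LLVM code: struct reg64 and the
helpers mov32_zext(), mov32_lo_only() and zext_by_shifts() are made-up names
for illustration, and the lo/hi split is just an assumed way a 32-bit target
might hold a 64-bit eBPF register. The shifts are modelled as logical
(unsigned) shifts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit eBPF register held as two 32-bit halves. */
struct reg64 {
    uint32_t lo;
    uint32_t hi;
};

/* Reassemble the 64-bit value seen by a later "r1" access. */
static uint64_t reg_value(struct reg64 r)
{
    return ((uint64_t)r.hi << 32) | r.lo;
}

/* "w1 = imm" as described for x86 and the interpreter:
 * the 32-bit write also clears the upper half. */
static struct reg64 mov32_zext(struct reg64 r, uint32_t imm)
{
    r.lo = imm;
    r.hi = 0;
    return r;
}

/* "w1 = imm" as a naive 32-bit translation: only r1.lo is written,
 * r1.hi keeps whatever it held before. */
static struct reg64 mov32_lo_only(struct reg64 r, uint32_t imm)
{
    r.lo = imm;
    return r;
}

/* "r1 <<= 32; r1 >>= 32" with logical shifts: drops the stale upper half. */
static struct reg64 zext_by_shifts(struct reg64 r)
{
    uint64_t v = reg_value(r);

    v <<= 32;
    v >>= 32;
    r.lo = (uint32_t)v;
    r.hi = (uint32_t)(v >> 32);
    return r;
}

/* Walks through the "r1 = ~0ULL; w1 = 0; use r1" example. */
static void check_example(void)
{
    struct reg64 r1 = { .lo = ~0u, .hi = ~0u };   /* r1 = ~0ULL */

    assert(reg_value(mov32_zext(r1, 0)) == 0);
    assert(reg_value(mov32_lo_only(r1, 0)) == 0xffffffff00000000ULL);
    assert(reg_value(zext_by_shifts(mov32_lo_only(r1, 0))) == 0);
}
```

In this model the shift pair brings the "lo only" result back in line with
the zero-extending one, which is the conversion suggested above. Whether LLVM
would actually emit it is still open until the implementation exists.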