On 2/28/19 1:59 AM, Jiong Wang wrote: > v1->v2: > - Simplified code logic inside hasLiveDefs. (Yonghong) > - Updated testcase insn-unit.s (Yonghong) > > BPF XADD semantics require all Defs of XADD are dead, meaning any result of > XADD insn is not used. > > However, BPF backend hasn't enabled sub-register liveness track, so when > the source and destination operands of XADD are GPR32, there is no > sub-register dead info. If we rely on the generic > MachineInstr::allDefsAreDead, then we will raise false alarm on GPR32 Def. > > This was fine as there was no sub-register code-gen support for XADD, but > it will be added by the second patch. Without proper handling of GPR32 Def, > testcase like the one added in the second patch will fail on sub-register > code-gen mode. > > To support GPR32 Def, ideally we could just enable sub-registr liveness > track on BPF backend, then allDefsAreDead could work on GPR32 Def. This > requires implementing TargetSubtargetInfo::enableSubRegLiveness on BPF. > > However, sub-register liveness tracking module inside LLVM is actually > designed for the situation where one register could be split into more than > one sub-registers for which case each sub-register could have their own > liveness and kill one of them doesn't kill others. So, tracking liveness > for each make sense. > > For BPF, each 64-bit register could only have one 32-bit sub-register. This > is exactly the case which LLVM think brings no benefits for doing > sub-register tracking, because the live range of sub-register must always > equal to its parent register, therefore liveness tracking is disabled even > the back-end has implemented enableSubRegLiveness. The detailed information > is at r232695: > > Author: Matthias Braun <matze@xxxxxxxxxx> > Date: Thu Mar 19 00:21:58 2015 +0000 > Do not track subregister liveness when it brings no benefits > > Hence, for BPF, we enhance MachineInstr::allDefsAreDead. Given the solo > sub-register always has the same liveness as its parent register, LLVM is > already attaching a implicit 64-bit register Def whenever the there is > a sub-register Def. The liveness of the implicit 64-bit Def is available. > For example, for "lock *(u32 *)(r0 + 4) += w9", the MachineOperand info > could be: > > $w9 = XADDW32 killed $r0, 4, $w9(tied-def 0), > implicit killed $r9, implicit-def dead $r9 > > Even though w9 is not marked as Dead, the parent register r9 is marked as > Dead correctly, and it is safe to use such information or our purpose. > > After above improvements, the second patch in this set then adds > sub-register code-gen support for XADD. > > Jiong Wang (3): > bpf: improve dead Defs check for XADD > bpf: enable sub-register code-gen for XADD > bpf: disassembler support for XADD under sub-register mode The change looks good. Tested with unit tests and kernel selftests. I added something like below to one of test_progs bpf programs: (void)__sync_fetch_and_add(value_p, 1); and the generated code is correct for alu32 mode and the objdump is able to disassemble correctly with -mattr=+alu32. You can add my ack: Acked-by: Yonghong Song <yhs@xxxxxx> Could you go ahead to push to llvm trunk? Thanks! > > lib/Target/BPF/BPFInstrInfo.td | 28 ++++++- > lib/Target/BPF/BPFMIChecking.cpp | 97 ++++++++++++++++++++++++- > lib/Target/BPF/Disassembler/BPFDisassembler.cpp | 3 +- > test/CodeGen/BPF/xadd.ll | 2 + > test/CodeGen/BPF/xadd_legal.ll | 26 +++++++ > test/MC/BPF/insn-unit.s | 3 +- > test/MC/BPF/load-store-32.s | 3 + > 7 files changed, 154 insertions(+), 8 deletions(-) > create mode 100644 test/CodeGen/BPF/xadd_legal.ll >