On Fri, Nov 06, 2020 at 02:46:55PM -0800, Linus Torvalds wrote: > On Fri, Nov 6, 2020 at 2:19 PM Luc Van Oostenryck > <luc.vanoostenryck@xxxxxxxxx> wrote: > > > > On Fri, Nov 06, 2020 at 11:44:00AM -0800, Linus Torvalds wrote: > > > On Fri, Nov 6, 2020 at 11:05 AM Song Liu <songliubraving@xxxxxx> wrote: > > > (a) the conditional following for the return value: > > > > > > select.32 %r4 <- %arg1, $0xffffffff, $0 > > > select.32 %r6 <- %r4, $0xffffffff, $0 > > > > > > Note how it doesn't trigger CSE, because it's not the same instruction > > > (%arg1 vs %r4), but sparse *could* see that a select that depends on a > > > previous select that has constant arguments can be simplified to be > > > conditional on the original select value instead, so it *could* be > > > CSE'd into one single thing. > > > > > > So the second "select" could be optimized away by realizing that it > > > really gets the same value as the first one. That *should* be trivial > > > to do in sparse by just having a simple pattern for select > > > simplification. > > > > > > And once that optimization is in place, the second optimization would be > > > > I think I may even have already this in an half-polished form. > > Ugh. I tried to see what I can do, and it does simplify, but not the > way I expected. > > The attached patch looks superficially sane (but probably is broken), > and results in one less "select" statement: > > > .L0: > <entry-point> > cbr %arg1, .L5, .L4 > > .L4: > call spin_lock > br .L5 > > .L5: > # call %r4 <- errno, %r1 > select.32 %r6 <- %arg1, $0xffffffff, $0 > cbr %arg1, .L6, .L2 > > .L2: > call spin_unlock > br .L6 > > .L6: > ret.32 %r6 > > but note how it removed not the select statement I expected, so you > don't actually end up with a branch to a branch anyway. > > Oh well. It's close. That "select" could be anywhere, so the branch > following could know that. Yes, the core of the problem is to move things up. Here is a patch that kinda do this, a bit too much even but the idea is there. With is it gives the following: cmp: .L0: <entry-point> select.32 %r4 <- %arg1, $0xffffffff, $0 select.32 %r6 <- %r4, $0xffffffff, $0 cbr %arg1, .L6, .L4 .L4: call spin_lock # call %r4 <- errno, %r1 call spin_unlock br .L6 .L6: ret.32 %r6 The patch move *all* selects the highest possible using the dominance tree. It complements the select simplification. The only part really interesting is the very last one, the remaining is just infrastructure. So, I think that the next questions are: * which selects should be moved up? * up to where should them be moved? I'll think a bit all this this WE. -- Luc >From 7c0d5a759756989919eb2799e7e50d36941d47b3 Mon Sep 17 00:00:00 2001 From: Luc Van Oostenryck <luc.vanoostenryck@xxxxxxxxx> Date: Sat, 7 Nov 2020 02:23:21 +0100 Subject: [PATCH] EXP: move up all selects at most as possible --- linearize.c | 21 +++++++++++++++++++++ linearize.h | 1 + simplify.c | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 56 insertions(+) diff --git a/linearize.c b/linearize.c index c1e3455adb67..005d90b3fb45 100644 --- a/linearize.c +++ b/linearize.c @@ -737,6 +737,27 @@ void insert_select(struct basic_block *bb, struct instruction *br, struct instru add_instruction(&bb->insns, br); } +void copy_select(struct basic_block *bb, struct instruction *insn) +{ + struct instruction *br; + pseudo_t target; + struct instruction *select; + + /* Remove the 'br' */ + br = delete_last_instruction(&bb->insns); + + select = alloc_typed_instruction(OP_SEL, insn->type); + select->bb = bb; + + use_pseudo(select, insn->src1, &select->src1); + use_pseudo(select, insn->src2, &select->src2); + use_pseudo(select, insn->src3, &select->src3); + select->target = insn->target; + + add_instruction(&bb->insns, select); + add_instruction(&bb->insns, br); +} + static inline int bb_empty(struct basic_block *bb) { return !bb->insns; diff --git a/linearize.h b/linearize.h index 77ae7c9ac976..cc38fe07c974 100644 --- a/linearize.h +++ b/linearize.h @@ -311,6 +311,7 @@ struct entrypoint { }; extern void insert_select(struct basic_block *bb, struct instruction *br, struct instruction *phi, pseudo_t if_true, pseudo_t if_false); +extern void copy_select(struct basic_block *bb, struct instruction *br); extern void insert_branch(struct basic_block *bb, struct instruction *br, struct basic_block *target); struct instruction *alloc_phisrc(pseudo_t pseudo, struct symbol *type); diff --git a/simplify.c b/simplify.c index f2aaa52de84b..633ebc6fad2c 100644 --- a/simplify.c +++ b/simplify.c @@ -1745,8 +1745,24 @@ static int simplify_cast(struct instruction *insn) return 0; } +#include "flowgraph.h" +static int is_defined_at(pseudo_t src, struct basic_block *bb) +{ + if (src->type != PSEUDO_REG) + return 1; + return domtree_dominates(bb, src->def->bb); +} + +static int move_select_to(struct instruction *insn, struct basic_block *bb) +{ + copy_select(bb, insn); + kill_instruction(insn); + return REPEAT_CSE; +} + static int simplify_select(struct instruction *insn) { + struct basic_block *bb; pseudo_t cond, src1, src2; cond = insn->src1; @@ -1782,6 +1798,24 @@ static int simplify_select(struct instruction *insn) kill_use(&insn->src3); return replace_with_value(insn, 0); } + + // Try to move this select 'up'. + bb = insn->bb; + while (bb->idom) { + bb = bb->idom; + if (bb == insn->bb) + goto out; + if (!is_defined_at(cond, bb)) + goto out; + if (!is_defined_at(src1, bb)) + goto out; + if (!is_defined_at(src2, bb)) + goto out; + } + if (bb != insn->bb) + return move_select_to(insn, bb); + +out: return 0; } -- 2.29.2