On Sat, Aug 27, 2011 at 8:37 AM, Jeff Garzik <jeff@xxxxxxxxxx> wrote:
>
> On our point of view, we probably prefer to simply turn off as many
> transformations as possible. They just waste time, when an optimizing LLVM
> backend is going to perform the same transformations anyway.

I disagree - mainly because I don't think we're interested in the back
end, are we?

If we were doing LLVM hacking, then I'd agree. But as it is, we're
supposed to improve sparse, not LLVM, so we should make sure that the
_sparse_ output makes sense, and LLVM is just a code generator, no?

Also, I suspect LLVM has an easier time of it if we were to generate
straightforward code rather than the extra basic blocks we do now.

So with the attached patch, your test-case turns into

   t.c:1:5: warning: symbol 'foo' was not declared. Should it be static?
   foo:
   .L0x7fc91affa010:
   	<entry-point>
   	phisrc.32   %phi2(x) <- %arg1
   	phisrc.32   %phi4(x) <- %arg1
   	phisrc.32   %phi6(i) <- $0
   	br          .L0x7fc91affa058

   .L0x7fc91affa058:
   	phi.32      %r3 <- %phi4(x), %phi5(x)
   	add.32      %r5 <- %r3, $42
   	phisrc.32   %phi3(x) <- %r5
   	phisrc.32   %phi5(x) <- %r5
   	phi.32      %r7 <- %phi6(i), %phi7(i)
   	add.32      %r8 <- %r7, $1
   	setlt.32    %r10 <- %r8, $10
   	phisrc.32   %phi7(i) <- %r8
   	br          %r10, .L0x7fc91affa058, .L0x7fc91affa130

   .L0x7fc91affa130:
   	ret.32      %r3

which looks messy due to all the phi's (that we don't combine - the
patch includes Kamil's "don't combine" thing), but looks simpler
otherwise.

Now sparse has optimized away the entry conditional (simply because it
got linearized separately and then the optimization was a trivial
constant one).

phi2/phi3 are both dead, but their 'phisrc' instructions haven't been
killed. Ugly.

The attached patch is ENTIRELY UNTESTED. The only thing I tested it on
is your test-case. Running "make test" requires stuff that I don't
even have installed, and I'm lazy ;^o

                      Linus

 cse.c       |    7 -------
 linearize.c |   14 +++++---------
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/cse.c b/cse.c
index 2a1574531993..2aabb65785f0 100644
--- a/cse.c
+++ b/cse.c
@@ -317,13 +317,6 @@ static struct instruction * try_to_cse(struct entrypoint *ep, struct instruction
 	b2 = i2->bb;

 	/*
-	 * PHI-nodes do not care where they are - the only thing that matters
-	 * are the PHI _sources_.
-	 */
-	if (i1->opcode == OP_PHI)
-		return cse_one_instruction(i1, i2);
-
-	/*
 	 * Currently we only handle the uninteresting degenerate case where
 	 * the CSE is inside one basic-block.
 	 */
diff --git a/linearize.c b/linearize.c
index f2034ce93572..06128ed5b5ee 100644
--- a/linearize.c
+++ b/linearize.c
@@ -2060,16 +2060,10 @@ pseudo_t linearize_statement(struct entrypoint *ep, struct statement *stmt)
 		concat_symbol_list(stmt->iterator_syms, &ep->syms);
 		linearize_statement(ep, pre_statement);

-		loop_body = loop_top = alloc_basic_block(ep, stmt->pos);
+		loop_body = alloc_basic_block(ep, stmt->pos);
 		loop_continue = alloc_basic_block(ep, stmt->pos);
 		loop_end = alloc_basic_block(ep, stmt->pos);

-		/* An empty post-condition means that it's the same as the pre-condition */
-		if (!post_condition) {
-			loop_top = alloc_basic_block(ep, stmt->pos);
-			set_activeblock(ep, loop_top);
-		}
-
 		if (pre_condition)
 			linearize_cond_branch(ep, pre_condition, loop_body, loop_end);

@@ -2082,10 +2076,12 @@ pseudo_t linearize_statement(struct entrypoint *ep, struct statement *stmt)

 		set_activeblock(ep, loop_continue);
 		linearize_statement(ep, post_statement);
+
+		/* No post-condition means that it's the same as the pre-condition */
 		if (!post_condition)
-			add_goto(ep, loop_top);
+			linearize_cond_branch(ep, pre_condition, loop_body, loop_end);
 		else
-			linearize_cond_branch(ep, post_condition, loop_top, loop_end);
+			linearize_cond_branch(ep, post_condition, loop_body, loop_end);
 		set_activeblock(ep, loop_end);
 		break;
 	}
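
To make the linearize.c change easier to picture, here is a rough sketch of the two control-flow shapes written as plain C with gotos. It is not sparse code and not the test-case from this thread: the function names (loop_old_shape, loop_new_shape) and the loop body are made up for illustration, and only the label names are borrowed from the basic-block variables in linearize_statement(). It assumes a loop with a pre-condition and no post-condition, which is the case the patch changes.

```c
/*
 * Hypothetical illustration only: neither function exists in sparse.
 *
 * Old shape: the pre-condition sits in its own loop_top block, and the
 * end of the loop jumps back to it unconditionally (the add_goto() to
 * loop_top that the patch removes).
 */
int loop_old_shape(int x, int n)
{
	int i = 0;			/* pre_statement */
loop_top:
	if (!(i < n))			/* pre_condition */
		goto loop_end;
	x += 42;			/* loop_body */
	i++;				/* loop_continue: post_statement */
	goto loop_top;
loop_end:
	return x;
}

/*
 * New shape: the pre-condition is tested once on entry, then tested
 * again at the bottom of the loop, branching straight to loop_body.
 * There is no separate loop_top block any more.
 */
int loop_new_shape(int x, int n)
{
	int i = 0;			/* pre_statement */
	if (!(i < n))			/* pre_condition, entry copy */
		goto loop_end;
loop_body:
	x += 42;
	i++;				/* loop_continue: post_statement */
	if (i < n)			/* pre_condition, bottom copy */
		goto loop_body;
loop_end:
	return x;
}
```

Both functions compute the same result for any n. The second form costs a duplicated condition but has one basic block fewer, and when the entry copy of the condition is constant (as with a loop that starts at zero and runs to a constant bound) it can be folded away on its own.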