Re: linearize bug?

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Sat, 27 Aug 2011 08:53:23 -0700

On Sat, Aug 27, 2011 at 8:37 AM, Jeff Garzik <jeff@xxxxxxxxxx> wrote:
>
> From our point of view, we would probably prefer to simply turn off as
> many transformations as possible.  They just waste time, when an optimizing
> LLVM backend is going to perform the same transformations anyway.

I disagree - mainly because I don't think we're interested in the back
end, are we?

If we were doing LLVM hacking, then I'd agree. But as it is, we're
supposed to improve sparse, not LLVM, so we should make sure that the
_sparse_ output makes sense, and LLVM is just a code generator, no?

Also, I suspect LLVM would have an easier time of it if we generated
straightforward code rather than the extra basic blocks we do now.

So with the attached patch, your test-case turns into

  t.c:1:5: warning: symbol 'foo' was not declared. Should it be static?
  foo:
  .L0x7fc91affa010:
	<entry-point>
	phisrc.32   %phi2(x) <- %arg1
	phisrc.32   %phi4(x) <- %arg1
	phisrc.32   %phi6(i) <- $0
	br          .L0x7fc91affa058

  .L0x7fc91affa058:
	phi.32      %r3 <- %phi4(x), %phi5(x)
	add.32      %r5 <- %r3, $42
	phisrc.32   %phi3(x) <- %r5
	phisrc.32   %phi5(x) <- %r5
	phi.32      %r7 <- %phi6(i), %phi7(i)
	add.32      %r8 <- %r7, $1
	setlt.32    %r10 <- %r8, $10
	phisrc.32   %phi7(i) <- %r8
	br          %r10, .L0x7fc91affa058, .L0x7fc91affa130

  .L0x7fc91affa130:
	ret.32      %r3

which looks messy due to all the phis (which we no longer combine - the
patch includes Kamil's "don't combine" thing), but is otherwise
simpler. Sparse has now optimized away the entry conditional (simply
because it got linearized separately, and then the optimization was a
trivial constant one).
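
To spell out what "linearized separately" means here, a rough C-level
sketch follows. The function names and the loop itself are made up for
illustration (the test-case is not quoted in this mail), so treat it as
an approximation of the shape of the code, not as actual sparse output.
The first function tests the condition at the top of the loop; the
second has the condition emitted twice, once on entry and once on the
back edge, which is roughly what the patch does when there is no
post-condition.

```c
/* Hypothetical stand-in for the test-case; names are illustrative only. */

/* Loop as written: one condition, tested at the top of every iteration. */
int foo_top_tested(int x)
{
	int i;

	for (i = 0; i < 10; i++)
		x += 42;
	return x;
}

/*
 * Same loop with the pre-condition emitted twice, which is the shape
 * the patch aims for: one copy guards entry, one copy sits on the
 * back edge.
 */
int foo_inverted(int x)
{
	int i = 0;

	/* Entry copy: i is the constant 0 here, so "0 < 10" folds away. */
	if (i < 10) {
		do {
			x += 42;
			i++;
		} while (i < 10);	/* back-edge copy: depends on i, stays */
	}
	return x;
}
```

Both functions return the same value for the same argument. The point is
only that the entry copy of the condition sees a constant 'i', so it can
be simplified on its own, without touching the back-edge test.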

phi2/phi3 are both dead, but their 'phisrc' instructions haven't been
killed. Ugly.

The attached patch is ENTIRELY UNTESTED. The only thing I tested it on
is your test-case. Running "make test" requires stuff that I don't
even have installed, and I'm lazy ;^o

                   Linus
 cse.c       |    7 -------
 linearize.c |   14 +++++---------
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/cse.c b/cse.c
index 2a1574531993..2aabb65785f0 100644
--- a/cse.c
+++ b/cse.c
@@ -317,13 +317,6 @@ static struct instruction * try_to_cse(struct entrypoint *ep, struct instruction
 	b2 = i2->bb;
 
 	/*
-	 * PHI-nodes do not care where they are - the only thing that matters
-	 * are the PHI _sources_.
-	 */
-	if (i1->opcode == OP_PHI)
-		return cse_one_instruction(i1, i2);
-
-	/*
 	 * Currently we only handle the uninteresting degenerate case where
 	 * the CSE is inside one basic-block.
 	 */
diff --git a/linearize.c b/linearize.c
index f2034ce93572..06128ed5b5ee 100644
--- a/linearize.c
+++ b/linearize.c
@@ -2060,16 +2060,10 @@ pseudo_t linearize_statement(struct entrypoint *ep, struct statement *stmt)
 		concat_symbol_list(stmt->iterator_syms, &ep->syms);
 		linearize_statement(ep, pre_statement);
 
- 		loop_body = loop_top = alloc_basic_block(ep, stmt->pos);
+		loop_body = alloc_basic_block(ep, stmt->pos);
  		loop_continue = alloc_basic_block(ep, stmt->pos);
  		loop_end = alloc_basic_block(ep, stmt->pos);
  
-		/* An empty post-condition means that it's the same as the pre-condition */
-		if (!post_condition) {
-			loop_top = alloc_basic_block(ep, stmt->pos);
-			set_activeblock(ep, loop_top);
-		}
-
 		if (pre_condition) 
  			linearize_cond_branch(ep, pre_condition, loop_body, loop_end);
 
@@ -2082,10 +2076,12 @@ pseudo_t linearize_statement(struct entrypoint *ep, struct statement *stmt)
 
 		set_activeblock(ep, loop_continue);
 		linearize_statement(ep, post_statement);
+
+		/* No post-condition means that it's the same as the pre-condition */
 		if (!post_condition)
-			add_goto(ep, loop_top);
+			linearize_cond_branch(ep, pre_condition, loop_body, loop_end);
 		else
- 			linearize_cond_branch(ep, post_condition, loop_top, loop_end);
+			linearize_cond_branch(ep, post_condition, loop_body, loop_end);
 		set_activeblock(ep, loop_end);
 		break;
 	}