Re: [RFC] sparse SSA construction

Luc Van Oostenryck <luc.vanoostenryck@xxxxxxxxx> · Tue, 15 Aug 2017 22:14:00 +0200

On Tue, Aug 15, 2017 at 8:36 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> https://github.com/lucvoo/sparse/tree/sssa-mini-clean
>
> I like it. But the commit messages are sorely lacking. And I'm not
> talking just about the lack of sign-off's, I'm talking about the
> messages just not explaining what is going on AT ALL.

I know, sorry for that, I don't like it myself (often I've the
impression to spend more time to wrote a decent commit
message than to write the corresponding code).
I'll work on that tomorrow (Europe here, already evening, sorry).

One of the thing I'm afraid won't be able to do (without a quite
a bit more work) is to do some nice small step-by-step commits.
It's completely changing the method, a step-by-step will be very
artificial and quite useless, I think.

> But with that fixed, I heartily recommend merging the new ssa
> construction code. It's based on a real paper, rather than being
> ad-hoc.

It's just too bad that the paper simply ignores gotos. The method
is really cool, fast, simple and clean, I love it. But I had to add
a few things, more akin to kludges than hacks IMO, to make it
work with gotos. But as is, it works pretty well, is fast and produce
correct SSA (also avoid few quadratic behaviours the current
simplify_one_symbol() has).

Improvement areas are:
- I feel we can do better with gotos and computed gotos
  it needs investigation
- the ptrmap thing need an hash based implementation
  but this is only needed for extreme cases. The trick will
  be to do something with dynamic size and which will not
  waste (too much)  memory. Nothing hard, but it will need
  some experiment  and I want first to collect some statistics
- there is an optimization that can be added for (what I call)
  'complex trivial phi-nodes' (it's clear in the paper, I think)

But none of this is blocking, just something that can easily
be added later.

The current situation is:
- As far as I saw the generated IR is correct (from a phi-node
  point of view).
- The performance is pretty nice (faster than the current method)
- There are some optimizations for free because the method
  do a kind of global-numbering.
- It does very well on a kernel allyesconfig. It just creates
  something like 8 or 12 more context/locking warnings I
  would like to understand (OTOH, knowing how broken
  the current SSA is, I'm surprised there isn't more differences).

For information, the current SSA construction is very broken,
both in simplify_one_symbol() (the main SSA construction)
and simplify_loads() (which also creates phi-nodes). They both
are 'load-driven' which is OK but they both put the phi-nodes
where the load is while phi-nodes need to placed where branches
meet. The loads are often placed in a BB following where the
join-point. Often this *seems* to be OK but in fact is very wrong
and create all sort of problems a bit everywhere

Best regards,
-- Luc Van Oostenryck
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html