On Sat, Jul 29, 2017 at 9:33 PM, Christopher Li <sparse@xxxxxxxxxxx> wrote:
> On Sat, Jul 29, 2017 at 12:35 PM, Luc Van Oostenryck
> <luc.vanoostenryck@xxxxxxxxx> wrote:
>>>> You can't do that once cycles are involved. You need something
>>>> like the marking algorithm used by kill_unreachable_bbs() for that.
>>>> And it was such a cycle that created the problem with the false
>>>> "crazy programmer" warning.
>>>
>>> No, I don't think so. The dominator search already takes cycles into
>>> account. By definition, if X dominates Y, every execution path from
>>> the entry point to Y has to go through X. If X is not reachable,
>>> neither is Y. It does not change where the blocks get deleted; it
>>> just doesn't need the marking algorithm. That is the point of the
>>> dominator tree.
>>
>> OK, I misread and didn't realize that you were talking about the
>> dominator *tree*.
>
> I was confused by your misunderstanding as well, at least for a few
> minutes. The dominator tree does not have back edges. So if the edge
> we are removing from the CFG matches an edge of the dominator tree
> (in other words, if we are removing an edge from the dominator tree),
> the whole branch of that tree can be removed as dead code.

Mmmm, interesting.

>> The real problem with such a tree is that you need to maintain it,
>> as it potentially changes each time there is a change in the CFG.
>
> Right. So far, most of the cases only remove edges and blocks, and
> updating the dominator tree for those should be relatively trivial.
> I haven't tried it yet, so we will see.
>
> Do we have a case that actually *adds* an edge or a block to the CFG?
> I don't recall one.

We do, more or less.
Once the code is linearized and inlining is done, we:
- never add new BBs
- remove some BBs (which also removes edges from the CFG)
- do several kinds of branch simplification (that's moving an edge, so
  technically it's adding an edge to the CFG; I'm not sure it changes
  the dom tree, though).

>> And of course, building this tree is not linear (in the number of BBs)
>> while finding the dead BBs is linear.
>
> Finding one round of dead BBs is linear in N, the number of BBs.
> Right now we mark and sweep the CFG for every node removed, for a
> total of M dead nodes. So the total is M*N.

Yes, but this calculation is not correct at all:
- each time a node is removed, the total number of nodes gets smaller,
  so the next pass is a bit faster (this corresponds to a factor of 2);
- much more importantly, each time kill_unreachable_bbs() is called,
  *all* the currently dead BBs are removed at once, so a single call
  can kill several BBs.
Of course, it will be different for each CFG/input file.

> In memops, the search for the dominating stores does a lot worse.
> That is why gcc compiles that file almost instantly while sparse
> takes 30 seconds on my machine. One big problem is that it does not
> cache the dominating result; it redoes the search again and again.

Uh? Which input file are you talking about?

BTW, two months ago or so, I posted a sort of mini-report about the
quadratic behaviour of the current code. This quadratic behaviour
doesn't exist with the series I have for the misplaced phi-nodes.

> The algorithm I am using is this one:
>
> Cooper, Keith D.; Harvey, Timothy J.; and Kennedy, Ken (2001).
> A Simple, Fast Dominance Algorithm
> https://www.cs.rice.edu/~keith/EMBED/dom.pdf
>
> It is O(N^2), but in practice it is faster than Lengauer-Tarjan,
> which is O(E * log(N)).
>
> Most of the sparse source code I tried converges within 3 rounds.
> For practical purposes it feels like linear.

*smile*
Feels like linear?
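For reference, the iterative core of that paper looks roughly like the
sketch below. This is only a minimal, self-contained illustration over
a toy CFG representation (plain arrays of predecessor indices, not
sparse's basic_block lists). It assumes the nodes are numbered 0..n-1
in reverse postorder with node 0 being the entry block; the paper uses
postorder numbers instead, so the comparisons here are flipped
accordingly:

#include <stdio.h>

#define MAXPRED 8
#define UNDEF   -1

struct node {
    int npred;
    int pred[MAXPRED];      /* predecessor node indices */
};

/* walk both fingers up the current idom chains until they meet */
static int intersect(const int *idom, int f1, int f2)
{
    while (f1 != f2) {
        while (f1 > f2)
            f1 = idom[f1];
        while (f2 > f1)
            f2 = idom[f2];
    }
    return f1;
}

/*
 * Compute idom[] for all nodes.  Assumes node 0 is the entry block
 * and that indices 0..n-1 follow a reverse postorder of the CFG.
 */
static void compute_idoms(const struct node *cfg, int n, int *idom)
{
    int changed = 1;

    for (int i = 0; i < n; i++)
        idom[i] = UNDEF;
    idom[0] = 0;            /* the entry dominates itself */

    while (changed) {
        changed = 0;
        for (int b = 1; b < n; b++) {   /* reverse postorder */
            int new_idom = UNDEF;

            for (int i = 0; i < cfg[b].npred; i++) {
                int p = cfg[b].pred[i];

                if (idom[p] == UNDEF)
                    continue;           /* not processed yet */
                if (new_idom == UNDEF)
                    new_idom = p;
                else
                    new_idom = intersect(idom, p, new_idom);
            }
            if (new_idom != UNDEF && idom[b] != new_idom) {
                idom[b] = new_idom;
                changed = 1;
            }
        }
    }
}

int main(void)
{
    /*
     * Toy CFG: 0 -> 1, 1 -> {2,3}, 2 -> 4, 3 -> 4, 4 -> 1
     * (a loop whose body contains an if/else diamond).
     */
    struct node cfg[5] = {
        { 0, { 0 } },       /* 0: entry        */
        { 2, { 0, 4 } },    /* 1: loop header  */
        { 1, { 1 } },       /* 2: then branch  */
        { 1, { 1 } },       /* 3: else branch  */
        { 2, { 2, 3 } },    /* 4: join / latch */
    };
    int idom[5];

    compute_idoms(cfg, 5, idom);
    for (int i = 0; i < 5; i++)
        printf("idom(%d) = %d\n", i, idom[i]);
    return 0;
}

The nice part is that, with the RPO numbering, intersect() only ever
walks the finger with the larger index up the current idom chain, so
each round is cheap and the total cost is dominated by how many rounds
are needed to converge.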
Did you try with several input files, some with big functions?

> Chris