Ghassan Shobaki wrote:
Thanks for the article, Ghassan. But I don't think this is the first optimal algorithm as you wrote in your introdauction. I implemented an analogous approach about seven years ago. And I am sure I am not the first.Hi,
I have previously studied superblock scheduling by importing superblock DAGs from gcc and feeding them into my scheduler (I have published my results at Micro in case you are interested: http://www.microarch.org/micro37/papers/25_Shobaki-Superblock.pdf)
The problem with the optimal algorithms is that they describe always simplified models. For example, the real register allocation is not only assigning registers. It is also coalescing, spilling, rematerialization, register elimination, sometimes code selection etc. The insn scheduling is not only rearranging insns but also partial register renaming and forward substitution, insn cloning, sometimes predication, insn mutation, code unification, supporting data/control speculation hardware, code unification etc.
It can not be decribed by a resonable model for optimal solution. IMHO usage of metaheuristic (like simulated annealing, taboo search, genetic aproaches etc) to get a better solution of real problems is more productive way. They work fine for real problem like chip design, tracing and placements in board designing.
In any case, I'll read your article with a great attention. It will be a joy to me to compare your approach with mine.
If you could manage integrate your code and approach into gcc for its real usage, that would be great. That would be the first industrial compiler using such approach.
Now, I wonder if I can do the same thing to traces by importing the DAGs and the necessary data-flow information from gcc. Unlike the superblock, which has a single entry and multipl exits, a trace is a more general structure that can have multiple entries and multiple exits. So, the question is: Can I extract traces for my experiments from gcc?Sorry, I don't know this code well. I am in the same position as you. I should research the code to answer your questions. I think Jan is the right person to do it. Because he wrote this code.
My understanding is that since version 3.4 gcc offers these two command-line options:
-fsched2-use-superblocks: This does extended basic block scheduling; no trace formation or tail duplication. It just schedules the extended basic blocks that appear naturally in the code.
-fsched2-use-traces: This does superblock scheduling, i.e. it forms traces then it does tail duplication to eliminate side entrances.
I am right about the functionalities of these options?
It seems to me that none of these options meets my objective of studying traces that have mltiple entrances and multiple exits. Please correct me if I am wrong. And if I am right, is there another way for me to get what I want, may be by doing some quick hack that bypasses tail duplication under the -fsched2-use-tarces option -does this make sense?
Vlad