Jan Hubicka wrote: > > > > > As far as experimentation is concerned, let me give some background about > > what I am doing and what kind of input I might be able to provide: > > > > I am doing research on optimal superblock scheduling and I need to import > > superblocks (more precisely, superblock data dependence graphs) from gcc to > > run them through my optimal solver (my research group currently has a way > > to import basic blocks and I am trying to extend that to superblocks). > > Even though this optimal solver is currently too slow to be included in a > > production compiler like gcc, it will be useful for studying the quality > > of gcc's schedules by comparing them against optimal. It will probably > > take me two or three months to get to that point for the very simplistic > > machine models that we are working with, but I'll be more than happy to > > provide you with any interesting results that I might come up with. > > Yes, this looks interesting. I would be curious how much gap there is > in between current scheduling and optimal one deifnitly. > On GCC mainline, you may also want to experiment with multipass > scheduling that (to my understanding) given large enought limits should > produce optimal schedules, but this is Vladimir's domain. > Actually the first cycle multipass instruction scheduling tries only to maximize number of insns issued on the current cycle. An optimal insn scheduling should minimize execution time of scheduled code which means it might require to issue not maximal number of insns on a cycle. Also the first cycle multipass insn scheduling solves the task in a primitive way (just trying part of permutations of ready insns without reusing already evaluated information from other tried permutation). This algorithm was just a demonstration of fast work DFA hazard recognizer. It improves code for Itanium (as I remember about 2% for SPECINT2000). As for optimal insn scheduling, I wrote a prototype for BB 2-3 years ago. It was based on a dynamic programing approach (reusing already evaluated information about scheduled insn subsequences). It worked fast for sequences up to 15-20 insns for ultrasparc. I see some improvements (say 7 cycles instead of 9 cycles) for about 30% of bbs on small benchmarks. I see the following problem with this approach for gcc as an industrial compiler: 1. Gcc has a lot of scheduler hooks which are oriented to list insn scheduling algorithm. Many ports uses them. It is practically impossible to embed these hooks into an optimal insn scheduling algorithm. 2. Long sequences of insns should be divided to keep the algorithm speed acceptable. Simple approach to this might work even worse than list insn scheduling. Some information (like when data from previous subsequence will be ready) should be transferred from one subsequence to another sequence. 3. Modern insn scheduling algorithms are now not only about insn rearranging (in linear code) but can include many other transformations like code cloning, predication, data/control flow speculation. I don't see how these transformations could be a part of an optimal insn scheduling). And I think there is more code improving potential using these transformations than an optimal scheduling. Although there is definitely some areas where an optimal scheduling could be useful. In any case, it would be very interesting to see the results of your research. We should encourage people to make researches on gcc. Vlad