On Wed, Aug 29, 2007 at 12:11:22PM +0100, Darryl Miles wrote:
> Rask Ingemann Lambertsen wrote:
> >On Tue, Aug 28, 2007 at 11:02:49PM +0100, Darryl Miles wrote:
> >   Peephole definitions check for cases like this and won't do the
> >optimization clobbering the flags register if the flags register is live at
> >that point.
>
> So I take it that peephole works by knowing the instructions emitted
> with annotations about the lifetimes of registers / flags / other useful
> stuff to help it.  I was thinking it was a bit more blind to things than
> that.

   The peephole2 pass tries to match one or more instructions in GCC's RTL
format against a template and, if successful, replaces those instructions
with one or more new ones. You can attach a condition to the replacement,
and you can also require that a scratch register be available. In this
particular case, the condition is that the flags register isn't live.

> I did not understand the relevance to knowing if it is (*movsi_xxx) or
> (*movdi_xxx).  From my point of view knowing that would not alter the 2
> original points I was making [1] and [3].  Maybe there is some
> pipelining (or other complex) issue I don't know about which makes the
> emitted code better than what I'm suggesting.

   No, it's just that when debugging problems with poor code, it's best to
know exactly what the compiler thinks it's generating.

> Recapping on the original issues:
>
> [1] failure to treat setting a register to the value of zero as a
> special case (since there maybe many ways to achieve this on a given
> CPU, different methods have different trades, insn length, unwanted side
> effects) which may allow this operation a lot of freedom for moving /
> scheduling.
> [3] usage of %ebx when %r8d would have been a better choice, at the time
> %ebx is needed to be allocated the lifetime of the temporary use of %r8d
> was over.  i.e. allocating of registers which form outputs but not
> inputs should take place last thing (at the moment of #APP) maybe by
> doing this %r8d would have been a candidate ?  which would negate the
> need for the push/pop's.

   GCC's register allocator isn't as good as we'd like it to be. I think
it's causing both problems.

> Thanks for your thoughts.  Maybe I am just expecting too much.

   I don't think so.

-- 
Rask Ingemann Lambertsen
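
P.S. As a rough illustration of the peephole condition described above (only do the replacement when the flags register isn't live), here is a toy model in Python. It is not GCC code and does not use RTL: the tuple instruction format, the READS_FLAGS / WRITES_FLAGS sets and the function names are all invented for this sketch. It only handles straight-line code and assumes the flags are dead at the end of the sequence.

```python
# Toy model only: instructions are tuples such as ("mov", 0, "ecx") or
# ("jne", "L1"). These opcode sets are illustrative, not a full x86 model.
READS_FLAGS = {"je", "jne", "adc", "sbb"}
WRITES_FLAGS = {"cmp", "test", "add", "sub", "xor"}


def flags_live_after(insns, i):
    """True if the flags value present after insns[i] is read before
    any later instruction overwrites it."""
    for op, *_ in insns[i + 1:]:
        if op in READS_FLAGS:   # checked first: adc/sbb read, then write
            return True
        if op in WRITES_FLAGS:
            return False
    return False                # assumption: flags are dead at the end


def peephole(insns):
    """Replace 'mov $0, reg' with 'xor reg, reg', which clobbers the
    flags, but only where the flags are not live at that point."""
    out = []
    for i, insn in enumerate(insns):
        if insn[0] == "mov" and insn[1] == 0 and not flags_live_after(insns, i):
            out.append(("xor", insn[2], insn[2]))
        else:
            out.append(insn)
    return out


prog = [
    ("cmp", "eax", "ebx"),
    ("mov", 0, "ecx"),      # flags still needed by the jne: left alone
    ("jne", "L1"),
    ("mov", 0, "edx"),      # flags overwritten by the add: replaced
    ("add", "eax", "edx"),
]
for insn in peephole(prog):
    print(insn)
```

The first zeroing mov sits between the cmp and the jne that consumes its result, so it is left as a mov; the second one is followed by an add that overwrites the flags anyway, so the flag-clobbering xor is safe there.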