Re: Question on peephole2 optimizer

Jeff Law <law@xxxxxxxxxx> · Tue, 04 Feb 2020 10:09:04 -0700



On Tue, 2020-02-04 at 10:44 +0100, Henri Cloetens wrote:
> Hello Richard, Jeff,
> 
> I checked both. 
> - The aarch64 backend uses the load double only for stacking operations.
>   I already have this: GCC provides it through the load_multiple construct.
>   If you define it, GCC will use it for stack and unstack operations.
> - The ARM backend has a peephole2 optimizer. Its problem is that it runs after
>   register allocation, so if the register allocation would need to change for the
>   optimization to apply, the pattern fails. I tried that and got it working, but ... I don't like it.
> - I found another way, which would work in theory:
>   a. Add it to the "movsi" 
>     1. Make a "define_expand" of the movsi, which does the following:
>       a. For the 'normal' case, it calls a define_insn "movsi_internal"
>       b. It maintains a per-function history of past calls to self.
>       c. For every call to movsi, it looks in the history if it finds a 'partner'
>           with which it can create a "load double"
>       d. If it finds one, it walks back through the insn list and checks
>           whether the replacement is appropriate. Mainly this means no in-between
>           jumps and labels, no in-between modification of the address register,
>           and not too far back.
>       e. If the checking is successful, it replaces the previous movsi with the load double.
> 
> For now, I will park this, and do as in aarch64. I might try it later. 
Trying to do this before register allocation isn't going to work the
way you want. But, well, good luck.
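
For reference, the checks in steps c to e of the quoted proposal can be
sketched roughly as below. This is a standalone C model, not GCC code:
the insn struct, the names find_load_partner and WORD_SIZE, and the
4-byte word size are all assumptions made for illustration. It only
covers the conditions listed above (no labels or jumps in between, no
write to the address register, bounded look-back). A real implementation
would need more, for example checks on intervening uses of the
destination registers and on memory aliasing.

```c
#include <stddef.h>

/* Hypothetical, simplified insn model; this is not GCC's rtx representation. */
enum insn_kind { INSN_LOAD, INSN_LABEL, INSN_JUMP, INSN_OTHER };

struct insn {
    enum insn_kind kind;
    int dest;    /* register written, or -1 if none */
    int base;    /* address register (meaningful for loads only) */
    long offset; /* byte offset from base (meaningful for loads only) */
};

#define WORD_SIZE 4 /* assumed size of one SImode load, in bytes */

/* Return the index of an earlier load that could be paired with the load
   at index cur, or -1 if no candidate is found within max_back insns. */
int find_load_partner(const struct insn *insns, size_t cur, size_t max_back)
{
    const struct insn *second = &insns[cur];

    if (second->kind != INSN_LOAD)
        return -1;

    for (size_t back = 1; back <= max_back && back <= cur; back++) {
        const struct insn *p = &insns[cur - back];

        /* A label or jump in between ends the search. */
        if (p->kind == INSN_LABEL || p->kind == INSN_JUMP)
            return -1;

        /* Candidate: same address register, adjacent word, distinct
           destinations, and the candidate does not overwrite the base. */
        if (p->kind == INSN_LOAD
            && p->base == second->base
            && p->dest != second->base
            && p->dest != second->dest) {
            long delta = second->offset - p->offset;
            if (delta == WORD_SIZE || delta == -WORD_SIZE)
                return (int)(cur - back);
        }

        /* Any write to the address register in between ends the search. */
        if (p->dest == second->base)
            return -1;
    }
    return -1; /* nothing suitable within the look-back window */
}
```

The model works on fixed register numbers, whereas at expand time the
operands are still pseudos and later passes are free to move or rewrite
the insns.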

jeff