Re: Register pressure reduction techniques on an accumulator machine

Jeff Law via Gcc-help <gcc-help@xxxxxxxxxxx> · Sat, 29 Jun 2024 06:55:49 -0600

On 6/29/24 5:35 AM, Filip Kokosiński via Gcc-help wrote:
Hi all,

I'm working on a custom backend for a very simple accumulator-based machine.
While the port mostly works on lower optimization settings (-O0, -O1), it tends
to fail when trying to spill registers on higher ones (-O2, -O3, -Os):

   main.c:49:1: error: unable to find a register to spill
      49 | }
         | ^
   main.c:49:1: error: this is the insn:
   (insn 1341 1350 245 44 (set (reg:QI 95 [ ivtmp.29 ])
           (reg:QI 852 [851])) "main.c":45:5 3 {movqi}
        (expr_list:REG_DEAD (reg:QI 852 [851])
           (nil)))
   during RTL pass: reload
   main.c:49:1: internal compiler error: in lra_split_hard_reg_for, at lra-assigns.cc:1868

Most of the instructions are defined in the following way:

   (define_insn "addqi3"
     [(set (match_operand:QI 0 "register_operand" "=a")
           (plus:QI
             (match_operand:QI 1 "register_operand" "0")
             (match_operand:QI 2 "nonimmediate_operand" "rm")))]
     ""
     "add\\t%2")

where the `a` constraint is the register constraint for the single-register
accumulator class.

I can get rid of this error and produce working code by using
`-fno-inline-small-functions`, but I was wondering if there are some other
places I could tinker with.

Here are the approaches I've tried so far to reduce the pressure on the
accumulator register:

1. Defining the `TARGET_SPILL_CLASS` hook and returning `GENERAL_REGS` when
    `rclass` is the accumulator register class. My target allows for indirect
    addressing using memory cells, so I've defined a set of 16 "pseudo"
    registers + SP. I was hoping that some spilling would happen from the
    accumulator register to the in-memory registers, removing the need for
    some secondary reloads, but alas this didn't help much.

2. Defining the `TARGET_SCHED_*` family of hooks - these ones are tricky, and
    my understanding of them is lacking. I've tried to just recreate the
    approach that the SuperH backend uses (R0 register pressure), but this
    approach didn't help me either.

From reading the mailing list I gathered that GCC might not be best suited for
single-operand machines, but I'm eager to know if there are techniques I could
try employing, or if there are backends that I've missed but which had already
dealt with this problem.

One thing that I've also considered is hiding the accumulator register from the
compiler altogether. This idea feels like a nuclear option, though, and I'm
kind of reluctant to try it.

-f flags may work around the issue in some cases, but they are not a
real solution.  Similarly, the TARGET_SCHED_* hooks are meant to improve
scheduling for the target; again, they may work around the problem in
some cases, but they're not a real solution.

Hiding the accumulator may be the best option.  It's often tough to do 
allocation of singleton register classes, particularly if they're 
heavily used.

You might consider looking at the rl78 port, which has a limited register
file.  Essentially it uses a virtual register file to provide the
allocator and reloading phase with a sensible enough target.  An
rl78-specific pass after register allocation then maps those virtuals down
to the physical register file.
Jeff
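
To make the virtual-register idea above concrete, here is a minimal toy sketch in C++. It is not GCC code and not how the rl78 port is implemented; it only models the shape of the approach. The types `VOp` and `VInsn`, the function `devirtualize`, the `lda`/`sta` mnemonics, and the memory-cell names `v0`..`v15` are all hypothetical. The assumption is that the allocator only ever sees instructions over virtual registers, and a later pass rewrites each one into a sequence that goes through the accumulator.

```cpp
#include <string>
#include <vector>

// Toy model only: none of these types exist in GCC.
enum class VOp { Mov, Add };

// A three-address instruction over virtual registers, i.e. the form the
// register allocator would be allowed to see (accumulator hidden).
struct VInsn {
    VOp op;
    int dst;   // virtual register written
    int src1;  // first virtual register read
    int src2;  // second virtual register read (ignored for Mov)
};

// Lower virtual-register instructions to accumulator-style assembly.
// Assumption: every virtual register lives in a memory cell named vN,
// and the machine has hypothetical "lda"/"sta" load/store mnemonics.
std::vector<std::string> devirtualize(const std::vector<VInsn>& in) {
    std::vector<std::string> out;
    auto cell = [](int r) { return "v" + std::to_string(r); };
    for (const VInsn& i : in) {
        out.push_back("lda\t" + cell(i.src1));      // acc = src1
        if (i.op == VOp::Add)
            out.push_back("add\t" + cell(i.src2));  // acc += src2
        out.push_back("sta\t" + cell(i.dst));       // dst = acc
    }
    return out;
}
```

For example, `devirtualize({{VOp::Add, 2, 0, 1}})` yields the three lines `lda v0`, `add v1`, `sta v2` (tab-separated). The output is deliberately naive: a real pass would want to drop redundant loads and stores when the accumulator already holds the needed value, which is where most of the work in such a design would likely go.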