On 6/29/24 5:35 AM, Filip Kokosiński via Gcc-help wrote:
Hi all,
I'm working on a custom backend for a very simple accumulator-based machine.
While the port mostly works on lower optimization settings (-O0, -O1), it tends
to fail when trying to spill registers on higher ones (-O2, -O3, -Os):
main.c:49:1: error: unable to find a register to spill
49 | }
| ^
main.c:49:1: error: this is the insn:
(insn 1341 1350 245 44 (set (reg:QI 95 [ ivtmp.29 ])
(reg:QI 852 [851])) "main.c":45:5 3 {movqi}
(expr_list:REG_DEAD (reg:QI 852 [851])
(nil)))
during RTL pass: reload
main.c:49:1: internal compiler error: in lra_split_hard_reg_for, at lra-assigns.cc:1868
Most of the instructions are defined in the following way:
(define_insn "addqi3"
  [(set (match_operand:QI 0 "register_operand" "=a")
        (plus:QI
          (match_operand:QI 1 "register_operand" "0")
          (match_operand:QI 2 "nonimmediate_operand" "rm")))]
  ""
  "add\\t%2")
Where the `a` constraint is a register constraint for the single register
accumulator class.
I can get rid of this error and produce working code by using the
`-fno-inline-small-functions` flag, but I was wondering if there are some other
places I could tinker with.
Here are the approaches I've tried so far to reduce the pressure on the
accumulator register:
1. Defining the `TARGET_SPILL_CLASS` hook and returning `GENERAL_REGS` when
`rclass` is the accumulator register class. My target allows for indirect
addressing using memory cells, so I've defined a set of 16 "pseudo"
registers + SP. I was hoping that some spilling would happen from the
accumulator register to the in-memory registers, removing the need for some
secondary reloads, but alas this didn't help much.
2. Defining the `TARGET_SCHED_*` family of hooks - these ones are tricky, and
my understanding of them is lacking. I've tried to just recreate the
approach that the SuperH backend uses (R0 register pressure), but this
approach didn't help me either.
From reading the mailing list I gathered that GCC might not be best suited for
single-operand machines, but I'm eager to know if there are techniques I could
try employing, or if there are backends that I've missed but which had already
dealt with this problem.
One thing that I've also considered is hiding the accumulator register from the
compiler altogether. This idea feels like a nuclear option, though, and I'm
kind of reluctant to try it.
-f flags may work around the issue in some cases, but they are not a
real solution. Similarly, the TARGET_SCHED_* hooks exist to improve
scheduling for the target; again, they may work around the problem in
some cases, but they're not a real solution.
Hiding the accumulator may be the best option. It's often tough to do
allocation of singleton register classes, particularly if they're
heavily used.
You might consider looking at the rl78 port which has a limited register
file. Essentially it uses a virtual register file to provide the
allocator and reloading phase with a sensible enough target. An rl78-specific
pass after register allocation then maps those virtuals down to
the physical register file.
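
To make the virtual-register idea concrete, here is a toy model in Python. It
is not the rl78 pass and not GCC code, just a sketch under stated assumptions:
the compiler proper only ever sees three-address operations on memory-backed
virtual registers (v0, v1, ...), and a late lowering step expands each one
into an accumulator sequence. The `lda`, `sta` and `sub` mnemonics, the
`VirtualInsn` type and the `devirtualize` function are all hypothetical names
invented for this illustration; only `add` appears in the original post.

```python
from typing import List, NamedTuple, Optional


class VirtualInsn(NamedTuple):
    """Three-address operation on virtual registers (hypothetical form)."""
    op: str         # "add", "sub" or "mov"
    dst: str        # destination virtual register, e.g. "v2"
    src1: str       # first source virtual register
    src2: str = ""  # second source; unused for "mov"


def devirtualize(insns: List[VirtualInsn]) -> List[str]:
    """Lower virtual-register insns to accumulator instructions.

    The accumulator exists only in this late step, so the register
    allocator never has to find or spill it.
    """
    out: List[str] = []
    acc: Optional[str] = None  # virtual register whose value is in the accumulator
    for insn in insns:
        if insn.op not in ("add", "sub", "mov"):
            raise ValueError(f"unknown op: {insn.op}")
        # Skip the load when the accumulator already holds src1.
        if acc != insn.src1:
            out.append(f"lda {insn.src1}")
        if insn.op != "mov":
            out.append(f"{insn.op} {insn.src2}")
        out.append(f"sta {insn.dst}")
        # After the store, the accumulator and dst hold the same value.
        acc = insn.dst
    return out


if __name__ == "__main__":
    program = [
        VirtualInsn("add", "v2", "v0", "v1"),  # v2 = v0 + v1
        VirtualInsn("sub", "v3", "v2", "v0"),  # v3 = v2 - v0
    ]
    for line in devirtualize(program):
        print(line)
```

In this model the second instruction needs no reload because the lowering step
tracks what the accumulator holds. A real port would do this on RTL after
register allocation and would have to handle many more cases (modes, memory
operands, flags), so treat this purely as an illustration of the split between
what the allocator sees and what the hardware executes.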
Jeff