On 2021/2/25 at 5:40 AM, Jim Wilson wrote:
On Wed, Feb 24, 2021 at 6:18 AM Jiaxun Yang <jiaxun.yang@xxxxxxxxxxx> wrote:
I found it's very difficult for GCC to generate this kind of pcrel_lo
expression; an RTX label_ref can't be lowered into such a LO_SUM
expression.
Yes, it is difficult. You need to generate a label, and put the label
number in an unspec in the auipc pattern, and then create a label_ref
to put in the addi. The fact that we have an unspec and a label_ref
means a number of optimizations get disabled, like basic block
duplication and loop unrolling, because they can't make a copy of an
instruction that uses a label as data, as they have no way to know how
to duplicate the label itself. Or at least RISC-V needs to create one
label. You probably need to create two labels.
There is a far easier way to do this, which is to just emit an
assembler macro, and let the assembler generate the labels and
relocs. This is what the RISC-V GCC port does by default. This
prevents some optimizations like scheduling the two instructions, but
enables some other optimizations like loop unrolling. So it is a
tossup. Sometimes we get better code with the assembler macro, and
sometimes we get better code by emitting the auipc and addi separately.
Thanks all,
I'll take this approach first: add the "lla" and "dlla" pseudo-instructions
to the assembler, and look for optimizations in the future.
Btw, I found we don't have any documentation for MIPS pseudo-instructions.
RISC-V puts them in the ISA manual, but that is not the case for MIPS.
Is it possible to have such a document in binutils?
The RISC-V gcc port can emit the auipc/addi with
-mexplicit-relocs -mcmodel=medany, but this is known to sometimes
fail. The problem is that if you have an 8-byte variable with 8-byte
alignment, and try to load it with 2 4-byte loads, gcc knows that
offset+4 must be safe from overflow because the data is 8-byte
aligned. However, when you use a pc-relative offset, which is the data
address minus the code address, the offset is only as aligned as the code is.
RISC-V has 2-byte instruction alignment with the C extension. So if
you have offset+4 and offset is only 2-byte aligned, it is possible
that offset+4 may overflow the add immediate field. The same thing
can happen with 16-byte data that is 16-byte aligned, accessed with
two 8-byte loads. There is no easy software solution. We just emit a
linker error in that case as we can't do anything else. I think this
would work better if auipc cleared some low bits of the result, in
which case the pc-relative offset would have enough alignment to
prevent overflow when adding small offsets, but it is far too late to
change how the RISC-V auipc works.
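
To make the overflow concrete, here is a minimal Python sketch of the arithmetic only (not code from GCC or binutils). It assumes the usual RISC-V split of a pc-relative offset into a 20-bit high part and a signed 12-bit low part; the function names pcrel_split and second_load_fits are hypothetical and exist only for this illustration.

```python
def pcrel_split(offset):
    """Split a pc-relative offset into (hi20, lo12).

    hi20 is rounded so that lo12 lands in the signed 12-bit
    range [-2048, 2047] of the add-immediate field.
    """
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    return hi20, lo12


def second_load_fits(offset, delta=4):
    """Check whether lo12 + delta still fits the 12-bit immediate,
    i.e. whether a second load at offset + delta can reuse the
    same auipc result."""
    _, lo12 = pcrel_split(offset)
    return -2048 <= lo12 + delta <= 2047


# If the offset itself were 8-byte aligned, offset + 4 would always fit.
aligned_ok = all(second_load_fits(off) for off in range(0, 1 << 16, 8))

# With only 2-byte alignment (C extension), some offsets overflow.
bad = [off for off in range(0, 1 << 16, 2) if not second_load_fits(off)]
```

In this sketch, aligned_ok is true, while bad collects the 2-byte-aligned offsets whose low part is too close to the top of the immediate range for offset+4 to fit. Calling second_load_fits(off, delta=8) over 16-byte-aligned and 2-byte-aligned offsets shows the same effect for the 16-byte case.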
Got your point, thanks for the reminder!
If it looks infeasible on the GCC side, another option would be adding
RISC-V style %pcrel_{hi,lo} modifiers on the assembler side. We could add
another pair of modifiers, like %pcrel_paired_{hi,lo}, to implement the
behavior. Would that be a good idea?
I wouldn't recommend following the RISC-V approach for the relocation.
Thanks.
- Jiaxun
Jim