William Tambe via Gcc-help <gcc-help@xxxxxxxxxxx> writes:
> Given the following program:
>
> unsigned char var;
> int main() {
>     return var;
> }
>
> And compiled using:
> pu32-elf-gcc -O3 -c -save-temps test.c
>
> Unnecessary zero-extension gets generated after a memory byte load
> which already zero-extends.
>
> LOAD_EXTEND_OP has been defined as follows:
> #define LOAD_EXTEND_OP(M) ZERO_EXTEND
>
> Find complete port at:
> https://github.com/fontamsoc/gcc/commit/45840063
> And machine description at:
> https://github.com/fontamsoc/gcc/blob/45840063/gcc/config/pu32/pu32.md
>
> Any idea what else can be tried to prevent the unnecessary zero-extension?

(Thanks for sharing the links. Unfortunately I can't look at unsubmitted
code for copyright reasons, so the below is just a guess.)

Even if you define LOAD_EXTEND_OP, it's still better to have a define_insn
that can zero_extend a memory source operand to a wider register
destination operand. Ideally there should be one instruction that
handles both registers and memory, rather than two separate
instructions, since that helps the register allocator to produce
better results.

E.g. the aarch64 pattern for this operation is:

(define_insn "*zero_extend<SHORT:mode><GPI:mode>2_aarch64"
  [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r")
        (zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m,m,w")))]
  ""
  "@
   and\t%<GPI:w>0, %<GPI:w>1, <SHORT:short_mask>
   ldr<SHORT:size>\t%w0, %1
   ldr\t%<SHORT:size>0, %1
   umov\t%w0, %1.<SHORT:size>[0]"
  [(set_attr "type" "logic_imm,load_4,f_loads,neon_to_gp")
   (set_attr "arch" "*,*,fp,fp")]
)

which is quite complicated, but it's the first two alternatives (the
register source "r" and the memory source "m", both with a general
register destination) that matter here.

Thanks,
Richard