Eliminating unnecessary loads on SPARC + IA64

I'm investigating some performance issues with the SPARC
and IA64 ports. I've included some sample code and the
resulting assembler output for both SPARC and IA64.
In the function writec below, fp->Length is loaded
twice, causing a fairly expensive load stall
in the processor pipeline. See code.c below.


But why doesn't gcc recognise that fp->Length is already
held in a register? Is this down to pointer aliasing?
GCC is treating the Length member as if it were declared
volatile. Are there target description macros (in
the relevant backends of gcc) that I can tweak to
get better code? I'm sure other people's code would
get a speedup too.
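
To make the aliasing question concrete, here is a minimal sketch of a case
where the store really does change fp->Length. It relies on the C rule that
an lvalue of character type (such as unsigned char) may access any object,
so the compiler cannot assume that fp->ptr[...] and fp->Length are distinct.
The helper aliased_length_after_write is hypothetical and only exists for
this illustration; writec and FP are copied from code.c.

```c
#include <string.h>

typedef struct
{
    unsigned char *ptr;
    int CurrPos;
    int Length;
    int pack;
} FP;

/* Same body as writec in code.c. */
void
writec (unsigned char c, FP *fp)
{
    fp->ptr[fp->Length] = c;
    ++fp->Length;
}

/* Hypothetical helper, not part of the original code: point fp->ptr at
   fp->Length itself, so the byte store overwrites part of Length before
   the increment reads it again.  */
int
aliased_length_after_write (void)
{
    FP f;

    memset (&f, 0, sizeof f);
    f.ptr = (unsigned char *) &f.Length;  /* ptr[0] is the first byte of Length */
    writec (0x01, &f);                    /* store hits Length, then ++Length */
    return f.Length;
}
```

If the compiler reused the value loaded before the store, this would return
1. With the reload it returns something else (2 on a little-endian target,
assuming the usual 32-bit int), which is the result the C semantics require.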


Note that I can remove the additional loads by rewriting the code and
declaring a local variable on the stack like so:

void 
writec_new (unsigned char c, FP *fp)
{
    int idx = fp->Length;
    fp->ptr[idx] = c;
    fp->Length = ++idx;
}

But it is unreasonable to rewrite our code base ...
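
A rough way to check that the local-variable form is a safe substitute, as
long as fp->ptr never points into the FP structure itself, is to run both
versions over separate buffers and compare the results. This is only a
sketch: same_result_without_aliasing is a hypothetical helper, and
writec_new here indexes through the local idx.

```c
#include <string.h>

typedef struct
{
    unsigned char *ptr;
    int CurrPos;
    int Length;
    int pack;
} FP;

/* Original form: Length is read, the byte is stored, Length is read again. */
void
writec (unsigned char c, FP *fp)
{
    fp->ptr[fp->Length] = c;
    ++fp->Length;
}

/* Local-variable form: Length is read once into idx. */
void
writec_new (unsigned char c, FP *fp)
{
    int idx = fp->Length;
    fp->ptr[idx] = c;
    fp->Length = idx + 1;
}

/* Hypothetical check: returns 1 if both versions leave the same buffer
   contents and the same Length when ptr targets a separate buffer.  */
int
same_result_without_aliasing (void)
{
    unsigned char buf_a[4], buf_b[4];
    FP a, b;
    int i;

    memset (buf_a, 0, sizeof buf_a);
    memset (buf_b, 0, sizeof buf_b);
    memset (&a, 0, sizeof a);
    memset (&b, 0, sizeof b);
    a.ptr = buf_a;
    b.ptr = buf_b;

    for (i = 0; i < 3; i++)
      {
        writec ((unsigned char) ('x' + i), &a);
        writec_new ((unsigned char) ('x' + i), &b);
      }

    return a.Length == b.Length
           && memcmp (buf_a, buf_b, sizeof buf_a) == 0;
}
```

The two versions are only interchangeable under that no-aliasing
assumption. If ptr did point at Length, writec_new would store idx + 1
over whatever the byte store wrote, whereas writec would increment the
modified value.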

Cheers

John


code.c (gcc -O2 -S code.c -o code.s)
===============================================
typedef struct 
{
    unsigned char *ptr;
    int CurrPos;
    int Length;
    int pack;
} FP;

void
writec (unsigned char c, FP *fp)
{
    fp->ptr[fp->Length] = c;
    ++fp->Length;
}
=================================================


code.s (SPARC)
=================================================
	.file	"code.c"
	.section	".text"
	.align 4
	.global writec
	.type	writec, #function
	.proc	020
writec:
	!#PROLOGUE# 0
	!#PROLOGUE# 1
	ld	[%o1+8], %o4     <<<<<<<<<<<< fp->Length Loaded here
	ld	[%o1], %o5
	stb	%o0, [%o5+%o4]
	ld	[%o1+8], %g1     <<<<<<<<<<<< Why do we reload fp->Length? It's already in %o4
	add	%g1, 1, %g1
	retl
	st	%g1, [%o1+8]
	.size	writec, .-writec
	.ident	"GCC: (GNU) 3.4.2"
===================================================


code.s (IA64)
===================================================
	.file	"code.c"
	.pred.safe_across_calls p1-p5,p16-p63
.text
	.align 16
	.global writec#
	.proc writec#
writec:
	.prologue
	.body
	adds r16 = 12, r33
	ld8 r15 = [r33]
	;;
	ld4 r14 = [r16]        <<<<<<<<< fp->Length loaded into r14
	;;
	sxt4 r14 = r14         <<<<<<<<< Just a sign extend insn
	;;
	add r15 = r15, r14
	;;
	st1 [r15] = r32
	ld4 r14 = [r16]         <<<<<<<< fp->Length reloaded here into r14 but it hasn't changed
	;;
	adds r14 = 1, r14
	;;
	st4 [r16] = r14
	br.ret.sptk.many b0
	.endp writec#
	.ident	"GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.2 2.96-128.7.2)"
===================================================

