Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/04/2011 12:29 PM, Erik Faye-Lund wrote:
> 
> Wouldn't this part be cleaner as a constant-length loop? Any
> optimizing compiler should end up unrolling this, and we don't get as
> much code-duplication...
> 

It would.

> Oddly enough, this change gave me a drastic (>  20%) performance
> increase in my test (isolating get_mode in a separate compilation
> unit, and calling it in a loop):
> 
> diff --git a/tree-walk.c b/tree-walk.c
> index b8d504b..114ad63 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -7,42 +7,30 @@
>   static const char *get_mode(const char *str, unsigned int *modep)
>   {
>   	unsigned char c;
> -	unsigned int mode = 0;
> +	unsigned int mode = 0, i;
> 
>   	/*
> -	 * Unroll what looks like a loop since the bounds are
> +	 * Allow the compiler to unroll the loop since the bounds are
>   	 * well-known. There should be at least 5 and at most 6
>   	 * characters available in any valid mode, as '40000' is the
>   	 * shortest while '160000' (S_IFGITLINK) is the longest.
>   	 */
> -	/* char 1 */
> -	c = *str++;
> -	if (c<  '0' || c>  '7') return NULL;
> -	mode = (mode<<  3) + (c - '0');
> -	/* char 2 */
> -	c = *str++;
> -	if (c<  '0' || c>  '7') return NULL;
> -	mode = (mode<<  3) + (c - '0');
> -	/* char 3 */
> -	c = *str++;
> -	if (c<  '0' || c>  '7') return NULL;
> -	mode = (mode<<  3) + (c - '0');
> -	/* char 4 */
> -	c = *str++;
> -	if (c<  '0' || c>  '7') return NULL;
> -	mode = (mode<<  3) + (c - '0');
> -	/* char 5 */
> -	c = *str++;
> -	if (c<  '0' || c>  '7') return NULL;
> -	mode = (mode<<  3) + (c - '0');
> +	for (i = 0; i<  5; ++i) {

s/i< 5/i < 5/ (nitpicking, yes)

> +		c = *str++;
> +		if (c<  '0' || c>  '7')
> +			return NULL;
> +		mode = (mode<<  3) + (c - '0');

This could be (micro-)optimized further as:

--%<--%<--
for (i = 0; i < 5; i++) {
	int c = *str++ - '0';
	if (c & ~7) {
		/* 5 chars is ok, so long as *str is now a space */
		if (i == 5 && c == ' ') {
			*modep = mode;
			return str;
		}
		return NULL;
	}
	mode = (mode << 3) + c;
}

/* this should be a space, or we've got a malformed entry */
if (*str != ' ')
	return NULL;

return str + 1;
--%<--%<--

This should be slightly faster since there's now only one comparison
inside the loop and since one branch contains a second branch fork-point
and an inevitable early return the branch prediction machinery in gcc
will favor the most common case properly, but pre-computations on 'mode'
have to be undone in one case of hitting the early return, it might be
faster to play pickup after a loop to 5 is done.

-- 
Andreas Ericsson                   andreas.ericsson@xxxxxx
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]