RE: Condition execution optimization with gcc 7.5

Hi,

Thanks for the answer.

I had been looking at the wrong function definition, in gcc-7.5.0/gcc/target.def:
	DEFHOOK
	(have_conditional_execution,
	 "This target hook returns true if the target supports conditional execution.\n\
	This target hook is required only when the target has several different\n\
	modes and they have different conditional execution capability, such as ARM.",
	 bool, (void),
	 default_have_conditional_execution)
and found this one, in gcc-7.5.0/gcc/targhooks.c:
	bool
	default_have_conditional_execution (void)
	{
	  return HAVE_conditional_execution;
	}
Finally, the macro HAVE_conditional_execution is defined in build-gcc/gcc/insn-config.h.
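
To make the chain above easier to follow, here is a minimal stand-alone sketch of how a default hook and a target override relate. It is not GCC code: the struct name gcc_target_sketch, the variable target_thumb1 and the hard-coded macro value are assumptions made for illustration. Only the two function names and their bodies mirror what is quoted in this thread.

```c
#include <stdbool.h>

/* ASSUMPTION: stands in for the value found in the generated insn-config.h. */
#define HAVE_conditional_execution 1

/* ASSUMPTION: hypothetical run-time stand-in for GCC's TARGET_THUMB1 macro. */
static bool target_thumb1 = false;

/* Mirrors the default from targhooks.c: just report the generated macro. */
static bool default_have_conditional_execution(void)
{
    return HAVE_conditional_execution;
}

/* Mirrors the arm port override quoted later in this thread. */
static bool arm_have_conditional_execution(void)
{
    return !target_thumb1;
}

/* Hypothetical, one-field model of the target hook table. */
struct gcc_target_sketch {
    bool (*have_conditional_execution)(void);
};

/* A port that overrides the hook installs its own function here;
   a port that does not would keep default_have_conditional_execution. */
static struct gcc_target_sketch targetm = { arm_have_conditional_execution };
```

With this model, targetm.have_conditional_execution() is decided by the port's override, so looking only at the default in targhooks.c does not tell you what the arm port answers.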

I will investigate the -march or -mcpu option.

Again, thanks a lot,

Benjamin Minguez

-----Original Message-----
From: Kyrylo Tkachov <Kyrylo.Tkachov@xxxxxxx> 
Sent: Tuesday, May 9, 2023 11:50 AM
To: Benjamin Minguez <benjamin.minguez@xxxxxxxxxx>; gcc-help@xxxxxxxxxxx
Subject: RE: Condition execution optimization with gcc 7.5

Hi Benjamin,

> -----Original Message-----
> From: Gcc-help <gcc-help-bounces+kyrylo.tkachov=arm.com@xxxxxxxxxxx>
> On Behalf Of Benjamin Minguez via Gcc-help
> Sent: Tuesday, May 9, 2023 8:54 AM
> To: gcc-help@xxxxxxxxxxx
> Subject: Condition execution optimization with gcc 7.5
> 
> Hello everyone,
> 
> I'm trying to optimize an application that contains a lot of branches. 
> I'm targeting armv8 processors and I'm using GCC 7.5.0 for compatibility reasons.

Of course GCC 7.5 is quite old now but if you're forced to use it...

> As the original application is similar to NGINX, I investigated 
> NGINX itself. I'm focusing on the HTTP header parsing. Basically, the 
> algorithm parses the input byte by byte and, based on each value, updates some variables.
> Here is an example, /src/http/ngx_http_parse.c: ngx_http_parse_header_line
>                 if (c) {
>                     hash = ngx_hash(0, c);
>                     r->lowcase_header[0] = c;
>                     i = 1;
>                     break;
>                 }
> 
>                 if (ch == '_') {
>                     if (allow_underscores) {
>                         hash = ngx_hash(0, ch);
>                         r->lowcase_header[0] = ch;
>                         i = 1;
> 
>                     } else {
>                         r->invalid_header = 1;
>                     }
> 
>                     break;
>                 }
> Also, most of the branches are not predictable because they compare against 
> data coming from the network.
> From these observations, I looked at the conditional execution 
> optimization step in GCC and I found this function that should do the work:
> cond_exec_find_if_block. I also found how to customize the decision to use 
> conditional instructions:

... This relates to the arm port i.e. the 32-bit target in Armv8-a, is that what you're targeting?
AArch64 has had more tuning work put into it over the years so may do better performance-wise if your processor and environment supports it.
If you're indeed looking at arm...

>                 #define MAX_CONDITIONAL_EXECUTE arm_max_conditional_execute ()
>                 int
>                 arm_max_conditional_execute (void)
>                 {
>                   return max_insns_skipped;
>                 }
>                 static int max_insns_skipped = 5;
> 
> I tried to compile NGINX with -O2 (which should enable if-conversion2) 
> but I did not notice any change in the code. I enabled GCC debug dumps (-da), 
> added some debug output to this function, and found that 
> targetm.have_conditional_execution returns false.
> 
> First, do you know how to switch this variable to true? I guess it is an 
> option during the configuration step of GCC.

Its definition on that branch is:
/* Only thumb1 can't support conditional execution, so return true if
   the target is not thumb1.  */
static bool
arm_have_conditional_execution (void)
{
  return !TARGET_THUMB1;
}

So it looks like you're maybe not setting the right -march or -mcpu option to enable the full armv8-a features?
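
One way to check what the compiler is actually targeting is to look at its predefined macros. The sketch below is an illustration under assumptions: the function name arm_isa_description is made up, and it relies on the usual convention that __thumb__ without __thumb2__ means Thumb-1. Compile it with exactly the same -march/-mcpu/-mthumb options as the application.

```c
#include <stddef.h>

/* Describe the instruction set this translation unit is being compiled for,
   based only on the compiler's predefined macros. */
static const char *arm_isa_description(void)
{
#if defined(__aarch64__)
    return "AArch64 (aarch64 port, not the arm port)";
#elif defined(__arm__) && defined(__thumb__) && !defined(__thumb2__)
    return "Thumb-1 (the case where the arm hook returns false)";
#elif defined(__arm__) && defined(__thumb2__)
    return "Thumb-2";
#elif defined(__arm__)
    return "A32 (Arm state)";
#else
    return "not an Arm target";
#endif
}
```

Printing the returned string from a small test program, or dumping the same macros by adding -dM -E to the usual flags on an empty input, should show whether the build ends up in the Thumb-1 case.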

Thanks,
Kyrill

> Then, I know that the decision to use conditional execution is based 
> on the extra cost added to compute both branches compared to the cost of a branch.
> In this specific case, branches are mispredicted and the cost is, indeed, high.
> Do you think that increasing max_insns_skipped will be enough to 
> help GCC use conditional execution?
> 
> Thank you in advance for your answers.
> 
> Best,
> Benjamin Minguez
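
For reference, the underscore case from the quoted snippet can be written in a branchy form and in a select-style form, which makes it easy to compare the generated assembly (-S) for each. This is only a sketch under assumptions: struct hdr_state, sketch_hash and both function names are hypothetical stand-ins for the nginx request fields and ngx_hash, and nothing here guarantees that GCC 7.5 will emit conditional instructions for either form.

```c
#include <stdint.h>

/* ASSUMPTION: trimmed-down stand-in for the request fields used in the snippet. */
struct hdr_state {
    uintptr_t     hash;
    unsigned char first;    /* stands in for r->lowcase_header[0] */
    unsigned      i;
    unsigned      invalid;  /* stands in for r->invalid_header */
};

/* ASSUMPTION: multiply-by-31 hash step standing in for ngx_hash. */
#define sketch_hash(key, c) ((uintptr_t)(key) * 31 + (c))

/* Same control flow as the original: one branch on allow_underscores. */
static void underscore_branchy(struct hdr_state *s, unsigned char ch,
                               int allow_underscores)
{
    if (allow_underscores) {
        s->hash = sketch_hash(0, ch);
        s->first = ch;
        s->i = 1;
    } else {
        s->invalid = 1;
    }
}

/* Select-style variant: every field is written unconditionally and the
   condition only chooses the value, which gives the compiler straight-line
   code that it may lower to conditional moves or selects. */
static void underscore_select(struct hdr_state *s, unsigned char ch,
                              int allow_underscores)
{
    int ok = (allow_underscores != 0);

    s->hash    = ok ? sketch_hash(0, ch) : s->hash;
    s->first   = ok ? ch : s->first;
    s->i       = ok ? 1u : s->i;
    s->invalid = ok ? s->invalid : 1u;
}
```

Both functions leave the state identical for the same inputs. The select variant trades the branch for extra loads and stores, so it is only likely to pay off when the branch really is mispredicted often.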



