RE: Condition execution optimization with gcc 7.5

Hello Richard,

I'm compiling for aarch64. Indeed, I was expecting conversion via conditional move or set.
I understand that code such as the NGINX HTTP parser is suitable for such conversion. But I was expecting that, for example, this code could benefit from it (ngx_hash is an inline function and is a simple xor operation):
>>                  if (c) {
>>                      hash = ngx_hash(0, c);
>>                      r->lowcase_header[0] = c;
>>                      i = 1;
>>                      break;
>>                  }
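
To make the expectation above concrete, here is a minimal sketch of the same block written once with a branch and once with plain selects. Everything in it is an assumption made for illustration: struct parse_state, toy_hash, step_branchy and step_select are hypothetical names, not NGINX code, and toy_hash only follows the "simple xor" description given above. The select form may give the compiler an easier route to CSEL-style instructions on aarch64, but that is not guaranteed; the generated assembly (gcc -O2 -S) is the only reliable check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the parser state; not the NGINX request type. */
struct parse_state {
    unsigned char lowcase_header[32];
};

/* Hypothetical stand-in for ngx_hash, following the "simple xor"
   description in the message above. */
static unsigned long toy_hash(unsigned long key, unsigned char c)
{
    return key ^ c;
}

/* Branchy form, mirroring the quoted snippet: the store and the
   assignments happen only when c is non-zero. */
static size_t step_branchy(struct parse_state *r, unsigned char c,
                           unsigned long *hash)
{
    size_t i = 0;

    if (c) {
        *hash = toy_hash(0, c);
        r->lowcase_header[0] = c;
        i = 1;
    }
    return i;
}

/* Select form: each value is computed unconditionally and chosen with ?:.
   Note that both memory locations are now written on every call. */
static size_t step_select(struct parse_state *r, unsigned char c,
                          unsigned long *hash)
{
    *hash = c ? toy_hash(0, c) : *hash;
    r->lowcase_header[0] = c ? c : r->lowcase_header[0];
    return c ? 1 : 0;
}
```

The two functions return the same values and leave the same contents in memory, but the select form stores unconditionally. A compiler generally cannot introduce such stores on its own, which is one plausible reason a block containing a conditional store is not converted automatically.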

Thank you for your help and your answers.

Best,
Benjamin Minguez

-----Original Message-----
From: Richard Earnshaw (lists) <Richard.Earnshaw@xxxxxxx> 
Sent: Thursday, May 18, 2023 1:02 PM
To: Benjamin Minguez <benjamin.minguez@xxxxxxxxxx>; Kyrylo Tkachov <Kyrylo.Tkachov@xxxxxxx>; gcc-help@xxxxxxxxxxx
Subject: Re: Condition execution optimization with gcc 7.5

On 17/05/2023 09:17, Benjamin Minguez via Gcc-help wrote:
> Hello,
> 
> I did add -march=armv8-a (and the other armv8.*-a values) to the GCC command line, but it looks like the conditional execution optimization, the cond_exec_find_if_block function, is never called. I enabled all GCC dumps (the -da option) and this function's debug messages are never printed.

Just to be certain, are you compiling for aarch32 (arm/thumb), or aarch64?  The latter does not support conditional execution, except via instructions such as CSEL.
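
One way to answer this from the compiler itself is to test its predefined macros. The following is a minimal sketch: __aarch64__, __arm__, __thumb__ and __thumb2__ are macros GCC predefines for the corresponding targets, while the function name target_state and the returned strings are purely illustrative.

```c
/* Report which Arm execution state this translation unit is compiled for.
   The macros are predefined by GCC; target_state and the strings it
   returns are hypothetical names used only for this illustration. */
static const char *target_state(void)
{
#if defined(__aarch64__)
    return "aarch64";
#elif defined(__arm__) && defined(__thumb__) && !defined(__thumb2__)
    return "aarch32, Thumb-1";
#elif defined(__arm__) && defined(__thumb__)
    return "aarch32, Thumb-2";
#elif defined(__arm__)
    return "aarch32, Arm state";
#else
    return "not an Arm target";
#endif
}
```

Printing the result of target_state() from a small test program built with exactly the same compiler and flags as the application shows which case applies. Only the aarch32 cases other than Thumb-1 are candidates for conditional execution.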

[more comments lower down]

> In parallel, I also tried different versions of GCC, 9.5.0 and 11.3.0, and again I got the same results.
> 
> Do you have any idea why this optimization step is not called?
> 
> Thank you in advance for your help.
> 
> Best,
> Benjamin Minguez
> 
> -----Original Message-----
> From: Benjamin Minguez
> Sent: Wednesday, May 10, 2023 8:43 AM
> To: 'Kyrylo Tkachov' <Kyrylo.Tkachov@xxxxxxx>; gcc-help@xxxxxxxxxxx
> Subject: RE: Condition execution optimization with gcc 7.5
> 
> Hi,
> 
> Thanks for the answer.
> 
> I had looked at the wrong function definition, gcc-7.5.0/gcc/target.def:
> 	DEFHOOK
> 	(have_conditional_execution,
> 	 "This target hook returns true if the target supports conditional execution.\n\
> 	This target hook is required only when the target has several different\n\
> 	modes and they have different conditional execution capability, such as ARM.",
> 	 bool, (void),
> 	 default_have_conditional_execution)
> and found this one, gcc-7.5.0/gcc/targhooks.c:
> 	bool
> 	default_have_conditional_execution (void)
> 	{
> 	  return HAVE_conditional_execution;
> 	}
> Finally, the macro HAVE_conditional_execution is defined here:
> build-gcc/gcc/insn-config.h.
> 
> I will investigate the -march or -mcpu option.
> 
> Again, thanks a lot,
> 
> Benjamin Minguez
> 
> -----Original Message-----
> From: Kyrylo Tkachov <Kyrylo.Tkachov@xxxxxxx>
> Sent: Tuesday, May 9, 2023 11:50 AM
> To: Benjamin Minguez <benjamin.minguez@xxxxxxxxxx>; 
> gcc-help@xxxxxxxxxxx
> Subject: RE: Condition execution optimization with gcc 7.5
> 
> Hi Benjamin,
> 
>> -----Original Message-----
>> From: Gcc-help <gcc-help-bounces+kyrylo.tkachov=arm.com@xxxxxxxxxxx>
>> On Behalf Of Benjamin Minguez via Gcc-help
>> Sent: Tuesday, May 9, 2023 8:54 AM
>> To: gcc-help@xxxxxxxxxxx
>> Subject: Condition execution optimization with gcc 7.5
>>
>> Hello everyone,
>>
>> I'm trying to optimize an application that contains a lot of branches.
>> I'm targeting armv8 processors and I'm using GCC 7.5.0 for compatibility reasons.
> 
> Of course GCC 7.5 is quite old now but if you're forced to use it...
> 
>> As the original application is similar to NGINX, I investigated NGINX
>> itself. I'm focusing on the HTTP header parsing. Basically, the
>> algorithm parses the input byte by byte and, based on each value, stores some variables.
>> Here is an example, /src/http/ngx_http_parse.c: ngx_http_parse_header_line
>>                  if (c) {
>>                      hash = ngx_hash(0, c);
>>                      r->lowcase_header[0] = c;
>>                      i = 1;
>>                      break;
>>                  }
>>
>>                  if (ch == '_') {
>>                      if (allow_underscores) {
>>                          hash = ngx_hash(0, ch);
>>                          r->lowcase_header[0] = ch;
>>                          i = 1;
>>
>>                      } else {
>>                          r->invalid_header = 1;
>>                      }
>>
>>                      break;
>>                  }

Your example code isn't complete enough to do a full analysis, but I doubt code like this would generate conditional execution anyway.  There are several reasons:

1) It's likely too long once machine instructions are generated
2) There are function calls (ngx_hash) in the body of the conditional blocks (calls cannot be conditionally executed); if they are inlined then see 1) above.
3) you have nested conditions (only the innermost block could be conditionally executed).
4) you wouldn't want to conditionally execute 'if (allow_underscores)' 
anyway as it's probably highly predictable as a branch.

R.

>> Also, most of the branches are not predictable because they compare against
>> data coming from the network.
>>  From these observations, I looked at the conditional execution
>> optimization step in GCC and I found this function that should do the work:
>> cond_exec_find_if_block. And how to customize the decision to use
>> conditional instructions:
> 
> ... This relates to the arm port i.e. the 32-bit target in Armv8-a, is that what you're targeting?
> AArch64 has had more tuning work put into it over the years so may do better performance-wise if your processor and environment supports it.
> If you're indeed looking at arm...
> 
>>                  #define MAX_CONDITIONAL_EXECUTE arm_max_conditional_execute ()
>>                  int
>>                  arm_max_conditional_execute (void)
>>                  {
>>                    return max_insns_skipped;
>>                  }
>>                  static int max_insns_skipped = 5;
>>
>> I tried to compile NGINX with -O2 (which should enable if-conversion2)
>> but I did not notice any change in the code. I enabled the GCC dumps (-da),
>> added some debug output to this function, and found out that
>> targetm.have_conditional_execution is set to false.
>>
>> First, do you know how to switch this variable to true? I guess it is an
>> option during the configuration step of GCC.
> 
> Its definition on that branch is:
> /* Only thumb1 can't support conditional execution, so return true if
>     the target is not thumb1.  */
> static bool
> arm_have_conditional_execution (void)
> {
>    return !TARGET_THUMB1;
> }
> 
> So it looks like you're maybe not setting the right -march or -mcpu option to enable the full armv8-a features?
> 
> Thanks,
> Kyrill
> 
>> Then, I know that the decision to use conditional execution is based
>> on the extra cost added by computing both branches compared to the cost of a branch.
>> In this specific case, branches are mispredicted and the cost is, indeed, high.
>> Do you think that increasing max_insns_skipped will be enough to
>> help GCC use conditional execution?
>>
>> Thank you in advance for your answers.
>>
>> Best,
>> Benjamin Minguez

R.