RE: Condition execution optimization with gcc 7.5

Hi Benjamin,

> -----Original Message-----
> From: Gcc-help <gcc-help-bounces+kyrylo.tkachov=arm.com@xxxxxxxxxxx>
> On Behalf Of Benjamin Minguez via Gcc-help
> Sent: Tuesday, May 9, 2023 8:54 AM
> To: gcc-help@xxxxxxxxxxx
> Subject: Condition execution optimization with gcc 7.5
> 
> Hello everyone,
> 
> I'm trying to optimize an application that contains a lot of branches. I'm
> targeting armv8 processors and I'm using GCC 7.5.0 for compatibility reasons.

Of course GCC 7.5 is quite old now but if you're forced to use it...

> As the original application is similar to NGINX, I investigated NGINX. I'm
> focusing on the HTTP header parsing. Basically, the algorithm parses byte by
> byte and, based on each value, stores some variables.
> Here is an example, /src/http/ngx_http_parse.c: ngx_http_parse_header_line
>                 if (c) {
>                     hash = ngx_hash(0, c);
>                     r->lowcase_header[0] = c;
>                     i = 1;
>                     break;
>                 }
> 
>                 if (ch == '_') {
>                     if (allow_underscores) {
>                         hash = ngx_hash(0, ch);
>                         r->lowcase_header[0] = ch;
>                         i = 1;
> 
>                     } else {
>                         r->invalid_header = 1;
>                     }
> 
>                     break;
>                 }
> Also, most of the branches are not predictable because they compare against
> data coming from the network.
> From these observations, I looked at the conditional execution optimization
> step in GCC and I found the function that should do the work:
> cond_exec_find_if_block, and how to customize the decision to use
> conditional instructions:

... This relates to the arm port, i.e. the 32-bit target in Armv8-A. Is that what you're targeting?
AArch64 has had more tuning work put into it over the years, so it may do better performance-wise if your processor and environment support it.
If you're indeed looking at arm...

>                 #define MAX_CONDITIONAL_EXECUTE arm_max_conditional_execute ()
>                 int
>                 arm_max_conditional_execute (void)
>                 {
>                   return max_insns_skipped;
>                 }
>                 static int max_insns_skipped = 5;
> 
> I tried to compile NGINX with -O2 (which should enable if-conversion2) but I
> did not notice any change in the code. I enabled GCC debug dumps (-da) and
> also added some debug output in this function, and I figured out that
> targetm.have_conditional_execution is set to false.
> 
> First, do you know how to switch this variable to true? I guess it is an
> option during the configuration step of GCC.

Its definition on that branch is:
/* Only thumb1 can't support conditional execution, so return true if
   the target is not thumb1.  */
static bool
arm_have_conditional_execution (void)
{
  return !TARGET_THUMB1;
}

Since this returns false only for Thumb-1, it looks like you may not be passing the right -march or -mcpu option to enable the full Armv8-A feature set?

Thanks,
Kyrill

> Then, I know that the decision to use conditional execution is based on the
> extra cost added to compute both branches compared to the cost of a branch.
> In this specific case, branches are mispredicted and the cost is, indeed, high.
> Do you think that increasing max_insns_skipped will be enough to help
> GCC use conditional execution?
> 
> Thank you in advance for your answers.
> 
> Best,
> Benjamin Minguez



