Hi Benjamin,

> -----Original Message-----
> From: Gcc-help <gcc-help-bounces+kyrylo.tkachov=arm.com@xxxxxxxxxxx>
> On Behalf Of Benjamin Minguez via Gcc-help
> Sent: Tuesday, May 9, 2023 8:54 AM
> To: gcc-help@xxxxxxxxxxx
> Subject: Condition execution optimization with gcc 7.5
>
> Hello everyone,
>
> I'm trying to optimize an application that contains a lot of branches. I'm
> targeting armv8 processors and I'm using GCC 7.5.0 for compatibility reasons.

Of course GCC 7.5 is quite old now but if you're forced to use it...

> As the original application is similar to NGINX, I investigated NGINX. I'm
> focusing on the HTTP header parsing. Basically, the algorithm parses byte by
> byte and, based on the value, stores some variables.
> Here is an example, /src/http/ngx_http_parse.c: ngx_http_parse_header_line
>             if (c) {
>                 hash = ngx_hash(0, c);
>                 r->lowcase_header[0] = c;
>                 i = 1;
>                 break;
>             }
>
>             if (ch == '_') {
>                 if (allow_underscores) {
>                     hash = ngx_hash(0, ch);
>                     r->lowcase_header[0] = ch;
>                     i = 1;
>
>                 } else {
>                     r->invalid_header = 1;
>                 }
>
>                 break;
>             }
> Also, most of the branches are not predictable because they compare against
> data coming from the network.
> From these observations, I looked at the conditional execution optimization
> step in GCC and I found this function that should do the work:
> cond_exec_find_if_block. And how to customize the decision to use
> conditional instructions:

...

This relates to the arm port, i.e. the 32-bit target in Armv8-a. Is that what you're targeting?
AArch64 has had more tuning work put into it over the years, so it may do better performance-wise if your processor and environment support it.

If you're indeed looking at arm...

> #define MAX_CONDITIONAL_EXECUTE arm_max_conditional_execute ()
> int
> arm_max_conditional_execute (void)
> {
>   return max_insns_skipped;
> }
> static int max_insns_skipped = 5;
>
> I tried to compile NGINX with -O2 (that should enable if-conversion2) but I did
> not notice any change in the code.
> I enabled GCC debug (-da) and also added
> some debug in this function and I figured out that
> targetm.have_conditional_execution is set to false.
>
> First, do you know how to switch this variable to true? I guess it is an option
> during the configuration step of GCC.

Its definition on that branch is:

/* Only thumb1 can't support conditional execution, so return true if
   the target is not thumb1.  */
static bool
arm_have_conditional_execution (void)
{
  return !TARGET_THUMB1;
}

So it looks like you're maybe not setting the right -march or -mcpu option to enable the full armv8-a features?

Thanks,
Kyrill

> Then, I know that the decision to use conditional execution is based on the
> extra cost added to compute both branches compared to the cost of a branch.
> In this specific case, branches are mispredicted and the cost is, indeed, high.
> Do you think that increasing max_insns_skipped will be enough to help
> GCC use conditional execution?
>
> Thank you in advance for your answers.
>
> Best,
> Benjamin Minguez
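
For reference, here is a rough sketch in C of what if-conversion is trying to achieve on the quoted '_' case. Everything named here is made up for illustration and is not nginx or GCC code: the struct hdr_state, the macro HASH_STEP (which only mimics the shape of nginx's ngx_hash) and both function names. The second function computes both outcomes and picks one with ?:, which gives the compiler a chance to use conditional selects or predicated instructions. Whether it actually does so depends on the target, the tuning and the GCC version.

```c
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the request fields touched by the
   quoted parser excerpt. */
struct hdr_state {
    uint32_t hash;
    uint8_t  lowcase0;
    uint32_t i;
    uint32_t invalid_header;
};

/* Assumption: same shape as nginx's ngx_hash(key, c), redefined here so
   the sketch is self-contained. */
#define HASH_STEP(key, c) ((uint32_t) (key) * 31 + (c))

/* Branchy form, mirroring the inner if/else of the quoted excerpt. */
void underscore_branchy(struct hdr_state *s, uint8_t ch, int allow_underscores)
{
    if (allow_underscores) {
        s->hash = HASH_STEP(0, ch);
        s->lowcase0 = ch;
        s->i = 1;
    } else {
        s->invalid_header = 1;
    }
}

/* Select form: every field is written unconditionally, and the value
   written is chosen with ?: instead of a branch. */
void underscore_select(struct hdr_state *s, uint8_t ch, int allow_underscores)
{
    uint32_t ok = (allow_underscores != 0);

    s->hash = ok ? HASH_STEP(0, ch) : s->hash;
    s->lowcase0 = ok ? ch : s->lowcase0;
    s->i = ok ? 1u : s->i;
    s->invalid_header = ok ? s->invalid_header : 1u;
}
```

The select form is not a drop-in replacement: it always stores to all four fields, so treat it only as a sketch of the transformation. Compiling both functions with -O2 -S and comparing the assembly is the way to see what the compiler really emits for a given -march/-mcpu.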