On 11/5/18 10:51 AM, Jan Hubicka wrote:
>> @honza: PING
>>
>> On 10/3/18 12:53 PM, Martin Liška wrote:
>>> On 10/3/18 11:04 AM, Jan Hubicka wrote:
>>>>>
>>>>> That was promised to be done by Honza Hubička. He's very skilled in IPA optimizations and he's aware
>>>>> of optimizations that cause troubles for live-patching.
>>>>
>>>> :) I am not sure how skilful I am, but here is what I arrived to.
>>>
>>> Heh! Thanks for the analysis.
>>>
>>>>
>>>> We have transformations that are modeled as cloning, which are
>>>>  - inlining (can't be disabled completely because of always inline, but -fno-inline
>>>>    does most of stuff)
>>>>  - cloning (disabled via -fno-ipa-cp)
>>>>  - ipa-sra (-fno-ipa-sra)
>>>>  - splitting (-fno-partial-inlining)
>>>> These should play well with Martin's tracking code
>>>
>>> I hope so!
>>>
>>>>
>>>> We propagate info about side effects of function:
>>>>  - function attribute discovery (pure, const, nothrow, malloc)
>>>>    Some of this can be disabled by -fno-ipa-pure-const, but not all
>>>>    of it.
>>>
>>> Would it be possible to add option for the remaining ones?
>
> Sure, I can prepare patch unless you beat me :)

Are you sure there's a call to 'analyze_function' where the analysis is
done when one sets -fno-ipa-pure-const?

>>>
>>>>    Nothrow does not have flag but it is obviously not a concern
>>>>    for C++
>>>
>>> s/C++/C?
>
> Yep for C
>>>
>>>>  - ipa-pta (disabled by default, -fno-ipa-pta)
>>>>  - ipa-reference (list of accessed/modified global vars), disable by -fno-ipa-reference
>>>>  - stack alignment requirements (no flag to disable)
>>>
>>> Would it be possible to add flag for it? Can you please point to a location where
>>> the optimization happens?
>
> In expand_call
>
>   /* Figure out the amount to which the stack should be aligned.  */
>   preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
>   if (fndecl)
>     {
>       struct cgraph_rtl_info *i = cgraph_node::rtl_info (fndecl);
>       /* Without automatic stack alignment, we can't increase preferred
>          stack boundary.  With automatic stack alignment, it is
>          unnecessary since unless we can guarantee that all callers will
>          align the outgoing stack properly, callee has to align its
>          stack anyway.  */
>       if (i
>           && i->preferred_incoming_stack_boundary
>           && i->preferred_incoming_stack_boundary < preferred_stack_boundary)
>         preferred_stack_boundary = i->preferred_incoming_stack_boundary;
>     }

I'm attaching a patch candidate for that.

>
>>>
>>>>  - inter-procedural register allocation (-fno-ipa-ra)
>>>>
>>>> We perform discovery of functions/variables with no address taken and
>>>> optimizations that are not valid otherwise such as duplicating them
>>>> or skipping them for alias analysis (no flag to disable)
>>>
>>> Can you please be more verbose here? What optimizations do you mean?
>
> See ipa_discover_readonly_nonaddressable_vars. If addressable bit is
> cleared we start analyzing uses of the variable via ipa_reference or so.
> If writeonly bit is set, we start removing writes to the variable and if
> readonly bit is set we skip any analysis about whether variable changed.

Likewise for this.

>>>
>>>>
>>>> Identical code folding merges function bodies that are semantically equivalent
>>>> and thus one can't patch one without patching another, -fno-ipa-icf
>>>
>>> Agree, I recommend disabling that.
>>>
>>>>
>>>> Unreachable code/variable removal may be concern too (no flag to disable)
>>>
>>> For functions that should be fine and handled by my script.
>>> For variables it can be a problem when a variable becomes alive. But that
>>> should be extremely rare for live-patching.
>>>
>>>>
>>>> Write only global variable discovery (no flag to disable)
>>>
>>> Similarly.
>>>
>>>>
>>>> Visibility changes with -flto and/or -fwhole-program
>>>>
>>>> We also have profile propagation (discovery of functions used only in cold regions,
>>>> but that I guess is only performance issue not correctness)
>>>> No flag to disable
>>>
>>> Hope these 2 do not happen for current Linux kernel.
>
> 2 will happen in kernel. We will try to propagate cold code
> inter-procedurally based on what we think will be undefined effect at
> runtime. Still I guess it is not big deal as it only affects
> size optimization.

Then let's ignore it.

Thoughts about the patches?

Martin

>
> Honza
>>>
>>> Martin
>>>
>>>>
>>>> Honza
>>>>
>>>>>
>>>>> Martin
>>>>>
>>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> Qing
>>>>>>
>>>>>
>>>
>>
>From ee912514f61ec2c4d126cf6d43b69d01a08886c8 Mon Sep 17 00:00:00 2001
From: marxin <mliska@xxxxxxx>
Date: Wed, 7 Nov 2018 13:47:40 +0100
Subject: [PATCH 2/2] Come up with the flag -fipa-stack-alignment.

gcc/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* common.opt: Add -fipa-stack-alignment flag.
	* doc/invoke.texi: Document it.
	* final.c (rest_of_clean_state): Guard stack shrinking with flag.

gcc/testsuite/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* gcc.target/i386/ipa-stack-alignment.c: New test.
---
 gcc/common.opt                                      |  4 ++++
 gcc/doc/invoke.texi                                 |  7 ++++++-
 gcc/final.c                                         |  3 ++-
 gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c | 13 +++++++++++++
 4 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 6a64b0e27d5..6ee48fbcfc4 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1724,6 +1724,10 @@ fipa-reference-addressable
 Common Report Var(flag_ipa_reference_addressable) Init(0) Optimization
 Discover read-only and write-only addressable variables.
 
+fipa-stack-alignment
+Common Report Var(flag_ipa_stack_alignment) Init(1) Optimization
+Reduce stack alignment on call sites if possible.
+
 fipa-matrix-reorg
 Common Ignore
 Does nothing. Preserved for backward compatibility.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 82c6fa913e8..2332e643993 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -413,7 +413,7 @@ Objective-C and Objective-C++ Dialects}.
 -finline-small-functions -fipa-cp -fipa-cp-clone @gol
 -fipa-bit-cp -fipa-vrp @gol
 -fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-reference-addressable @gol
--fipa-icf -fira-algorithm=@var{algorithm} @gol
+-fipa-stack-alignment -fipa-icf -fira-algorithm=@var{algorithm} @gol
 -fira-region=@var{region} -fira-hoist-pressure @gol
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots @gol
@@ -8901,6 +8901,11 @@ Enabled by default at @option{-O} and higher.
 Discover read-only and write-only addressable variables.
 Enabled by default at @option{-O} and higher.
 
+@item -fipa-stack-alignment
+@opindex fipa-stack-alignment
+Reduce stack alignment on call sites if possible.
+Enabled by default.
+
 @item -fipa-pta
 @opindex fipa-pta
 Perform interprocedural pointer analysis and interprocedural modification
diff --git a/gcc/final.c b/gcc/final.c
index 6e61f1e17a8..0c1ac625f37 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4890,7 +4890,8 @@ rest_of_clean_state (void)
   /* We can reduce stack alignment on call site only when we are sure that
      the function body just produced will be actually used in the final
      executable.  */
-  if (decl_binds_to_current_def_p (current_function_decl))
+  if (flag_ipa_stack_alignment
+      && decl_binds_to_current_def_p (current_function_decl))
     {
       unsigned int pref = crtl->preferred_stack_boundary;
       if (crtl->stack_alignment_needed > crtl->preferred_stack_boundary)
diff --git a/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c b/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c
new file mode 100644
index 00000000000..1176b59aa5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-ipa-stack-alignment -O" } */
+
+typedef struct {
+  long a;
+  long b[];
+} c;
+
+c *d;
+void e() { d->b[0] = 5; }
+void f() { e(); }
+
+/* { dg-final { scan-assembler "sub.*%.sp" } } */
-- 
2.19.1
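
For readers following the stack alignment discussion, here is a rough
standalone model of how the two sides fit together. It is only a sketch:
callee_info, record_boundary and call_site_boundary are hypothetical names,
not GCC internals, and I am assuming the guarded block in
rest_of_clean_state is what stores the value that the quoted expand_call
code later reads.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-callee RTL info; 0 means
   "nothing was recorded for this function".  */
struct callee_info
{
  unsigned int preferred_incoming_stack_boundary;
};

/* Model of the producer side guarded by the new flag: the boundary is
   recorded only when the flag is on and the body binds to the current
   definition.  (Assumption: this mirrors the guarded block in final.c.)  */
void
record_boundary (struct callee_info *info, int flag_ipa_stack_alignment,
                 int binds_to_current_def, unsigned int pref)
{
  assert (info != 0);
  if (flag_ipa_stack_alignment && binds_to_current_def)
    info->preferred_incoming_stack_boundary = pref;
}

/* Model of the consumer side quoted from expand_call: start from the
   target default and only ever lower it to what the callee recorded.  */
unsigned int
call_site_boundary (unsigned int target_default,
                    const struct callee_info *info)
{
  unsigned int boundary = target_default;
  if (info
      && info->preferred_incoming_stack_boundary
      && info->preferred_incoming_stack_boundary < boundary)
    boundary = info->preferred_incoming_stack_boundary;
  return boundary;
}
```

In this model, turning the flag off leaves the recorded value at 0, so
every call site keeps the target default regardless of what the callee
needs. That is the property a live patch wants, since a replaced callee
may need more alignment than the original did.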
>From 8691490a142228021ed65313a72d176d06966829 Mon Sep 17 00:00:00 2001
From: marxin <mliska@xxxxxxx>
Date: Wed, 7 Nov 2018 13:31:41 +0100
Subject: [PATCH 1/2] Come up with -fipa-reference-addressable flag.

gcc/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* cgraph.h (ipa_discover_readonly_nonaddressable_vars): Rename to ...
	(ipa_discover_nonaddressable_vars): ... this.
	* common.opt: Come up with new flag -fipa-reference-addressable.
	* doc/invoke.texi: Document it.
	* ipa-reference.c (propagate): Call the renamed fn.
	* ipa-visibility.c (whole_program_function_and_variable_visibility): Likewise.
	* ipa.c (ipa_discover_readonly_nonaddressable_vars): Renamed to ...
	(ipa_discover_nonaddressable_vars): ... this.  Discover
	non-addressable variables only with the newly added flag.
	* opts.c: Enable the newly added flag with -O1 and higher
	optimization level.

gcc/testsuite/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* gcc.dg/tree-ssa/writeonly-2.c: New test.
---
 gcc/cgraph.h                                |  2 +-
 gcc/common.opt                              |  6 +++++-
 gcc/doc/invoke.texi                         | 10 ++++++++--
 gcc/ipa-reference.c                         |  2 +-
 gcc/ipa-visibility.c                        |  2 +-
 gcc/ipa.c                                   | 11 +++++++----
 gcc/opts.c                                  |  1 +
 gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c | 20 ++++++++++++++++++++
 8 files changed, 44 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index c13d79850fa..bf65d426cda 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -2403,7 +2403,7 @@ void record_references_in_initializer (tree, bool);
 
 /* In ipa.c */
 void cgraph_build_static_cdtor (char which, tree body, int priority);
-bool ipa_discover_readonly_nonaddressable_vars (void);
+bool ipa_discover_nonaddressable_vars (void);
 
 /* In varpool.c */
 tree ctor_for_folding (tree);
diff --git a/gcc/common.opt b/gcc/common.opt
index 2971dc21b1f..6a64b0e27d5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1718,7 +1718,11 @@ Perform Identical Code Folding for variables.
 
 fipa-reference
 Common Report Var(flag_ipa_reference) Init(0) Optimization
-Discover readonly and non addressable static variables.
+Discover read-only and non addressable static variables.
+
+fipa-reference-addressable
+Common Report Var(flag_ipa_reference_addressable) Init(0) Optimization
+Discover read-only and write-only addressable variables.
 
 fipa-matrix-reorg
 Common Ignore
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ae260c6ac6d..82c6fa913e8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,8 +412,8 @@ Objective-C and Objective-C++ Dialects}.
 -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
 -finline-small-functions -fipa-cp -fipa-cp-clone @gol
 -fipa-bit-cp -fipa-vrp @gol
--fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-icf @gol
--fira-algorithm=@var{algorithm} @gol
+-fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-reference-addressable @gol
+-fipa-icf -fira-algorithm=@var{algorithm} @gol
 -fira-region=@var{region} -fira-hoist-pressure @gol
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots @gol
@@ -7866,6 +7866,7 @@ compilation time.
 -fipa-pure-const @gol
 -fipa-profile @gol
 -fipa-reference @gol
+-fipa-reference-addressable @gol
 -fmerge-constants @gol
 -fmove-loop-invariants @gol
 -fomit-frame-pointer @gol
@@ -8895,6 +8896,11 @@ Discover which static variables do not escape the
 compilation unit.
 Enabled by default at @option{-O} and higher.
 
+@item -fipa-reference-addressable
+@opindex fipa-reference-addressable
+Discover read-only and write-only addressable variables.
+Enabled by default at @option{-O} and higher.
+
 @item -fipa-pta
 @opindex fipa-pta
 Perform interprocedural pointer analysis and interprocedural modification
diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
index 43bbdae5d66..2cdce3cbfa6 100644
--- a/gcc/ipa-reference.c
+++ b/gcc/ipa-reference.c
@@ -705,7 +705,7 @@ propagate (void)
   if (dump_file)
     cgraph_node::dump_cgraph (dump_file);
 
-  remove_p = ipa_discover_readonly_nonaddressable_vars ();
+  remove_p = ipa_discover_nonaddressable_vars ();
   generate_summary ();
 
   /* Propagate the local information through the call graph to produce
diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c
index 000207fa31b..1da594111f8 100644
--- a/gcc/ipa-visibility.c
+++ b/gcc/ipa-visibility.c
@@ -911,7 +911,7 @@ whole_program_function_and_variable_visibility (void)
 {
   function_and_variable_visibility (flag_whole_program);
   if (optimize || in_lto_p)
-    ipa_discover_readonly_nonaddressable_vars ();
+    ipa_discover_nonaddressable_vars ();
   return 0;
 }
 
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 3b6b5e5c8d4..eb53e7dcd06 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -752,10 +752,10 @@ clear_addressable_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED)
   return false;
 }
 
-/* Discover variables that have no longer address taken or that are read only
-   and update their flags.
+/* Discover variables that have no longer address taken, are read-only or
+   write-only and update their flags.
 
-   Return true when unreachable symbol removan should be done.
+   Return true when unreachable symbol removal should be done.
 
    FIXME: This can not be done in between gimplify and omp_expand since
    readonly flag plays role on what is shared and what is not. Currently we do
@@ -764,8 +764,11 @@ clear_addressable_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED)
    make sense to do it before early optimizations.  */
 
 bool
-ipa_discover_readonly_nonaddressable_vars (void)
+ipa_discover_nonaddressable_vars (void)
 {
+  if (!flag_ipa_reference_addressable)
+    return false;
+
   bool remove_p = false;
   varpool_node *vnode;
   if (dump_file)
diff --git a/gcc/opts.c b/gcc/opts.c
index 34c283dd765..9b9018c6c48 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -451,6 +451,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_fipa_reference_addressable, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fmerge_constants, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_freorder_blocks, NULL, 1 },
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c b/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
new file mode 100644
index 00000000000..2272d15b171
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fno-ipa-reference-addressable" } */
+static struct a {int magic1,b;} a;
+volatile int magic2;
+static struct b {int a,b,c,d,e,f;} magic3;
+
+struct b foo();
+
+void
+t()
+{
+ a.magic1 = 1;
+ magic2 = 1;
+ magic3 = foo();
+}
+/* { dg-final { scan-tree-dump "magic1" "optimized"} } */
+/* { dg-final { scan-tree-dump "magic3" "optimized"} } */
+/* { dg-final { scan-tree-dump "magic2" "optimized"} } */
+/* { dg-final { scan-tree-dump "foo" "optimized"} } */
+
-- 
2.19.1
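
To make the effect of the early return easier to picture, here is a
heavily simplified model of the per-variable decision. This is a sketch
under assumptions, not the code in ipa.c: var_refs, var_flags and
discover_nonaddressable are hypothetical names, and the return value
here just reports whether any bit changed, which is not the same thing
as the real remove_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of how one variable is referenced.  */
struct var_refs
{
  bool address_taken;
  bool read;
  bool written;
};

/* Simplified stand-ins for the bits the pass updates.  */
struct var_flags
{
  bool addressable;
  bool readonly;
  bool writeonly;
};

/* Update FLAGS from REFS; do nothing when the flag is off, which is
   what the early return added by the patch achieves.  Returns true
   when at least one bit changed.  */
bool
discover_nonaddressable (int flag_ipa_reference_addressable,
                         const struct var_refs *refs,
                         struct var_flags *flags)
{
  assert (refs != 0 && flags != 0);
  if (!flag_ipa_reference_addressable)
    return false;

  /* Nothing can be concluded once the address escapes.  */
  if (refs->address_taken)
    return false;

  bool changed = false;
  if (flags->addressable)
    {
      flags->addressable = false;
      changed = true;
    }
  if (!refs->written && !flags->readonly)
    {
      flags->readonly = true;
      changed = true;
    }
  if (refs->written && !refs->read && !flags->writeonly)
    {
      flags->writeonly = true;
      changed = true;
    }
  return changed;
}
```

A variable like a.magic1 in writeonly-2.c corresponds to refs with only
written set: with the flag on, the model marks it write-only (after
which stores to it may be removed); with -fno-ipa-reference-addressable
the bits stay untouched, which is why the test expects the stores to
survive in the optimized dump.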