Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)

On 11/5/18 10:51 AM, Jan Hubicka wrote:
>> @honza: PING
>>
>> On 10/3/18 12:53 PM, Martin Liška wrote:
>>> On 10/3/18 11:04 AM, Jan Hubicka wrote:
>>>>>
>>>>> That was promised to be done by Honza Hubička. He's very skilled in IPA optimizations and he's aware
>>>>> of optimizations that cause trouble for live-patching.
>>>>
>>>> :) I am not sure how skilful I am, but here is what I arrived at.
>>>
>>> Heh! Thanks for the analysis.
>>>
>>>>
>>>>  We have transformations that are modeled as cloning, which are
>>>>   - inlining  (can't be disabled completely because of always inline, but -fno-inline
>>>>     does most of the work)
>>>>   - cloning (disabled via -fno-ipa-cp)
>>>>   - ipa-sra (-fno-ipa-sra)
>>>>   - splitting (-fno-partial-inlining)
>>>>  These should play well with Martin's tracking code
>>>
>>> I hope so!
>>>
>>>>
>>>>  We propagate info about side effects of function:
>>>>   - function attribute discovery (pure, const, nothrow, malloc)
>>>>     Some of this can be disabled by -fno-ipa-pure-const, but not all
>>>>     of it.
>>>
>>> Would it be possible to add option for the remaining ones?
> 
> Sure, I can prepare patch unless you beat me :)

Are you sure there's a call to 'analyze_function' where the analysis is done
when one sets -fno-ipa-pure-const?

>>>
>>>>     Nothrow does not have a flag but it is obviously not a concern
>>>>     for C++
>>>
>>> s/C++/C?
> 
> Yep for C
>>>
>>>>   - ipa-pta (disabled by default, -fno-ipa-pta)
>>>>   - ipa-reference (list of accessed/modified global vars), disabled by -fno-ipa-reference
>>>>   - stack alignment requirements (no flag to disable)
>>>
>>> Would it be possible to add a flag for it? Can you please point to a location where
>>> the optimization happens?
> 
> In expand_call
> 
>   /* Figure out the amount to which the stack should be aligned.  */
>   preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
>   if (fndecl)
>     {
>       struct cgraph_rtl_info *i = cgraph_node::rtl_info (fndecl);
>       /* Without automatic stack alignment, we can't increase preferred
>          stack boundary.  With automatic stack alignment, it is
>          unnecessary since unless we can guarantee that all callers will
>          align the outgoing stack properly, callee has to align its
>          stack anyway.  */
>       if (i
>           && i->preferred_incoming_stack_boundary
>           && i->preferred_incoming_stack_boundary < preferred_stack_boundary)
>         preferred_stack_boundary = i->preferred_incoming_stack_boundary;
>     }

I'm attaching patch candidate for that.
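
For readers following the thread: the snippet above lets a call site use a
smaller boundary than PREFERRED_STACK_BOUNDARY when the callee is known to
need less. Below is a minimal stand-alone sketch of that selection. The names
`callee_info` and `call_site_stack_boundary` are hypothetical and only model
the quoted logic; they are not GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-function data GCC keeps in
   cgraph_rtl_info; only the field used by the quoted snippet is modeled.  */
struct callee_info
{
  /* Boundary in bits the callee needs on entry; 0 means "not known".  */
  unsigned int preferred_incoming_stack_boundary;
};

/* Return the stack boundary (in bits) a call site would use.
   DEFAULT_BOUNDARY plays the role of PREFERRED_STACK_BOUNDARY and
   INFO may be NULL when nothing is known about the callee.  */
static unsigned int
call_site_stack_boundary (unsigned int default_boundary,
			  const struct callee_info *info)
{
  assert (default_boundary > 0);

  unsigned int boundary = default_boundary;
  /* Only ever lower the boundary, and only on known, non-zero data.  */
  if (info
      && info->preferred_incoming_stack_boundary
      && info->preferred_incoming_stack_boundary < boundary)
    boundary = info->preferred_incoming_stack_boundary;
  return boundary;
}
```

If a live patch later replaces the callee with a body that needs a larger
boundary, callers built with the reduced value would presumably no longer
match it, which is why being able to switch this off seems useful.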

> 
>>>
>>>>   - inter-procedural register allocation (-fno-ipa-ra)
>>>>
>>>>  We perform discovery of functions/variables with no address taken and
>>>>  optimizations that are not valid otherwise such as duplicating them
>>>>  or skipping them for alias analysis (no flag to disable)
>>>
>>> Can you please be more verbose here? What optimizations do you mean?
> 
> See ipa_discover_readonly_nonaddressable_vars. If addressable bit is
> cleared we start analyzing uses of the variable via ipa_reference or so.
> If writeonly bit is set, we start removing writes to the variable and if
> readonly bit is set we skip any analysis about whether variable changed.

Likewise for this.
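
As an illustration of the write-only case described above, here is a small
sketch in the spirit of the new writeonly-2.c test. The names `last_event`
and `record_event` are made up for this example.

```c
#include <assert.h>

/* Hypothetical write-only variable: it is stored to, never read, and its
   address is never taken.  */
static int last_event;

int
record_event (int event)
{
  /* Once the variable is discovered to be write-only, this store may be
     removed by the compiler.  */
  last_event = event;
  return event + 1;
}
```

If the store is dropped, a live patch that adds a reader of `last_event` would
not see the values it expects. With -fno-ipa-reference-addressable the store
is expected to stay, which is what the test in the attached patch scans for.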

>>>
>>>>
>>>>  Identical code folding merges function bodies that are semantically equivalent
>>>>  and thus one can't patch one without patching another, -fno-ipa-icf
>>>
>>> Agree, I recommend disabling that.
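
A minimal sketch of the situation, with two hypothetical functions that have
the same body:

```c
#include <assert.h>

/* Two hypothetical functions whose bodies are semantically identical.  */
int
scale_width (int v)
{
  return v * 3 + 1;
}

int
scale_height (int v)
{
  return v * 3 + 1;
}
```

With -fipa-icf, GCC may keep a single body and turn the other function into
an alias or a thunk. Replacing only `scale_width` at run time could then
change `scale_height` as well, hence the recommendation to disable it.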
>>>
>>>>
>>>>  Unreachable code/variable removal may be concern too (no flag to disable)
>>>
>>> For functions that should be fine and handled by my script.
>>> For variables it can be a problem when a variable becomes alive. But that
>>> should be extremely rare for live-patching.
>>>
>>>>
>>>>  Write only global variable discovery (no flag to disable)
>>>
>>> Similarly.
>>>
>>>>
>>>>  Visibility changes with -flto and/or -fwhole-program
>>>>
>>>>  We also have profile propagation (discovery of functions used only in cold regions,
>>>>  but that I guess is only a performance issue, not correctness)
>>>>  No flag to disable
>>>
>>> Hope these 2 do not happen for current Linux kernel.
> 
> 2 will happen in kernel.  We will try to propagate cold code
> inter-procedurally based on what we think will be undefined effect at
> runtime.  Still, I guess it is not a big deal as it only affects
> size optimization.

Then let's ignore it.

Thoughts about the patches?
Martin

> 
> Honza
>>>
>>> Martin
>>>
>>>>
>>>> Honza
>>>>
>>>>>
>>>>> Martin
>>>>>
>>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> Qing
>>>>>>
>>>>>
>>>
>>

From ee912514f61ec2c4d126cf6d43b69d01a08886c8 Mon Sep 17 00:00:00 2001
From: marxin <mliska@xxxxxxx>
Date: Wed, 7 Nov 2018 13:47:40 +0100
Subject: [PATCH 2/2] Come up with the flag -fipa-stack-alignment.

gcc/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* common.opt: Add -fipa-stack-alignment flag.
	* doc/invoke.texi: Document it.
	* final.c (rest_of_clean_state): Guard stack
	shrinking with flag.

gcc/testsuite/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* gcc.target/i386/ipa-stack-alignment.c: New test.
---
 gcc/common.opt                                      |  4 ++++
 gcc/doc/invoke.texi                                 |  7 ++++++-
 gcc/final.c                                         |  3 ++-
 gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c | 13 +++++++++++++
 4 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 6a64b0e27d5..6ee48fbcfc4 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1724,6 +1724,10 @@ fipa-reference-addressable
 Common Report Var(flag_ipa_reference_addressable) Init(0) Optimization
 Discover read-only and write-only addressable variables.
 
+fipa-stack-alignment
+Common Report Var(flag_ipa_stack_alignment) Init(1) Optimization
+Reduce stack alignment on call sites if possible.
+
 fipa-matrix-reorg
 Common Ignore
 Does nothing. Preserved for backward compatibility.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 82c6fa913e8..2332e643993 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -413,7 +413,7 @@ Objective-C and Objective-C++ Dialects}.
 -finline-small-functions  -fipa-cp  -fipa-cp-clone @gol
 -fipa-bit-cp -fipa-vrp @gol
 -fipa-pta  -fipa-profile  -fipa-pure-const  -fipa-reference  -fipa-reference-addressable @gol
--fipa-icf  -fira-algorithm=@var{algorithm} @gol
+-fipa-stack-alignment  -fipa-icf  -fira-algorithm=@var{algorithm} @gol
 -fira-region=@var{region}  -fira-hoist-pressure @gol
 -fira-loop-pressure  -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots @gol
@@ -8901,6 +8901,11 @@ Enabled by default at @option{-O} and higher.
 Discover read-only and write-only addressable variables.
 Enabled by default at @option{-O} and higher.
 
+@item -fipa-stack-alignment
+@opindex fipa-stack-alignment
+Reduce stack alignment on call sites if possible.
+Enabled by default.
+
 @item -fipa-pta
 @opindex fipa-pta
 Perform interprocedural pointer analysis and interprocedural modification
diff --git a/gcc/final.c b/gcc/final.c
index 6e61f1e17a8..0c1ac625f37 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4890,7 +4890,8 @@ rest_of_clean_state (void)
   /* We can reduce stack alignment on call site only when we are sure that
      the function body just produced will be actually used in the final
      executable.  */
-  if (decl_binds_to_current_def_p (current_function_decl))
+  if (flag_ipa_stack_alignment
+      && decl_binds_to_current_def_p (current_function_decl))
     {
       unsigned int pref = crtl->preferred_stack_boundary;
       if (crtl->stack_alignment_needed > crtl->preferred_stack_boundary)
diff --git a/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c b/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c
new file mode 100644
index 00000000000..1176b59aa5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-ipa-stack-alignment -O" } */
+
+typedef struct {
+  long a;
+  long b[];
+} c;
+
+c *d;
+void e() { d->b[0] = 5; }
+void f() { e(); }
+
+/* { dg-final { scan-assembler "sub.*%.sp" } } */
-- 
2.19.1

From 8691490a142228021ed65313a72d176d06966829 Mon Sep 17 00:00:00 2001
From: marxin <mliska@xxxxxxx>
Date: Wed, 7 Nov 2018 13:31:41 +0100
Subject: [PATCH 1/2] Come up with -fipa-reference-addressable flag.

gcc/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* cgraph.h (ipa_discover_readonly_nonaddressable_vars): Rename
	to ...
	(ipa_discover_nonaddressable_vars): ... this.
	* common.opt: Come up with new flag -fipa-reference-addressable.
	* doc/invoke.texi: Document it.
	* ipa-reference.c (propagate): Call the renamed fn.
	* ipa-visibility.c (whole_program_function_and_variable_visibility):
	Likewise.
	* ipa.c (ipa_discover_readonly_nonaddressable_vars): Renamed to
	...
	(ipa_discover_nonaddressable_vars): ... this.  Discover
	non-addressable variables only with the newly added flag.
	* opts.c: Enable the newly added flag with -O1 and higher
	optimization level.

gcc/testsuite/ChangeLog:

2018-11-07  Martin Liska  <mliska@xxxxxxx>

	* gcc.dg/tree-ssa/writeonly-2.c: New test.
---
 gcc/cgraph.h                                |  2 +-
 gcc/common.opt                              |  6 +++++-
 gcc/doc/invoke.texi                         | 10 ++++++++--
 gcc/ipa-reference.c                         |  2 +-
 gcc/ipa-visibility.c                        |  2 +-
 gcc/ipa.c                                   | 11 +++++++----
 gcc/opts.c                                  |  1 +
 gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c | 20 ++++++++++++++++++++
 8 files changed, 44 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index c13d79850fa..bf65d426cda 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -2403,7 +2403,7 @@ void record_references_in_initializer (tree, bool);
 
 /* In ipa.c  */
 void cgraph_build_static_cdtor (char which, tree body, int priority);
-bool ipa_discover_readonly_nonaddressable_vars (void);
+bool ipa_discover_nonaddressable_vars (void);
 
 /* In varpool.c  */
 tree ctor_for_folding (tree);
diff --git a/gcc/common.opt b/gcc/common.opt
index 2971dc21b1f..6a64b0e27d5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1718,7 +1718,11 @@ Perform Identical Code Folding for variables.
 
 fipa-reference
 Common Report Var(flag_ipa_reference) Init(0) Optimization
-Discover readonly and non addressable static variables.
+Discover read-only and non addressable static variables.
+
+fipa-reference-addressable
+Common Report Var(flag_ipa_reference_addressable) Init(0) Optimization
+Discover read-only and write-only addressable variables.
 
 fipa-matrix-reorg
 Common Ignore
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ae260c6ac6d..82c6fa913e8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,8 +412,8 @@ Objective-C and Objective-C++ Dialects}.
 -finline-functions  -finline-functions-called-once  -finline-limit=@var{n} @gol
 -finline-small-functions  -fipa-cp  -fipa-cp-clone @gol
 -fipa-bit-cp -fipa-vrp @gol
--fipa-pta  -fipa-profile  -fipa-pure-const  -fipa-reference  -fipa-icf @gol
--fira-algorithm=@var{algorithm} @gol
+-fipa-pta  -fipa-profile  -fipa-pure-const  -fipa-reference  -fipa-reference-addressable @gol
+-fipa-icf  -fira-algorithm=@var{algorithm} @gol
 -fira-region=@var{region}  -fira-hoist-pressure @gol
 -fira-loop-pressure  -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots @gol
@@ -7866,6 +7866,7 @@ compilation time.
 -fipa-pure-const @gol
 -fipa-profile @gol
 -fipa-reference @gol
+-fipa-reference-addressable @gol
 -fmerge-constants @gol
 -fmove-loop-invariants @gol
 -fomit-frame-pointer @gol
@@ -8895,6 +8896,11 @@ Discover which static variables do not escape the
 compilation unit.
 Enabled by default at @option{-O} and higher.
 
+@item -fipa-reference-addressable
+@opindex fipa-reference-addressable
+Discover read-only and write-only addressable variables.
+Enabled by default at @option{-O} and higher.
+
 @item -fipa-pta
 @opindex fipa-pta
 Perform interprocedural pointer analysis and interprocedural modification
diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
index 43bbdae5d66..2cdce3cbfa6 100644
--- a/gcc/ipa-reference.c
+++ b/gcc/ipa-reference.c
@@ -705,7 +705,7 @@ propagate (void)
   if (dump_file)
     cgraph_node::dump_cgraph (dump_file);
 
-  remove_p = ipa_discover_readonly_nonaddressable_vars ();
+  remove_p = ipa_discover_nonaddressable_vars ();
   generate_summary ();
 
   /* Propagate the local information through the call graph to produce
diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c
index 000207fa31b..1da594111f8 100644
--- a/gcc/ipa-visibility.c
+++ b/gcc/ipa-visibility.c
@@ -911,7 +911,7 @@ whole_program_function_and_variable_visibility (void)
 {
   function_and_variable_visibility (flag_whole_program);
   if (optimize || in_lto_p)
-    ipa_discover_readonly_nonaddressable_vars ();
+    ipa_discover_nonaddressable_vars ();
   return 0;
 }
 
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 3b6b5e5c8d4..eb53e7dcd06 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -752,10 +752,10 @@ clear_addressable_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED)
   return false;
 }
 
-/* Discover variables that have no longer address taken or that are read only
-   and update their flags.
+/* Discover variables that have no longer address taken, are read-only or
+   write-only and update their flags.
 
-   Return true when unreachable symbol removan should be done.
+   Return true when unreachable symbol removal should be done.
 
    FIXME: This can not be done in between gimplify and omp_expand since
    readonly flag plays role on what is shared and what is not.  Currently we do
@@ -764,8 +764,11 @@ clear_addressable_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED)
    make sense to do it before early optimizations.  */
 
 bool
-ipa_discover_readonly_nonaddressable_vars (void)
+ipa_discover_nonaddressable_vars (void)
 {
+  if (!flag_ipa_reference_addressable)
+    return false;
+
   bool remove_p = false;
   varpool_node *vnode;
   if (dump_file)
diff --git a/gcc/opts.c b/gcc/opts.c
index 34c283dd765..9b9018c6c48 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -451,6 +451,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_fipa_reference_addressable, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fmerge_constants, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_freorder_blocks, NULL, 1 },
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c b/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
new file mode 100644
index 00000000000..2272d15b171
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fno-ipa-reference-addressable" } */
+static struct a {int magic1,b;} a;
+volatile int magic2;
+static struct b {int a,b,c,d,e,f;} magic3;
+
+struct b foo();
+
+void
+t()
+{
+ a.magic1 = 1;
+ magic2 = 1;
+ magic3 = foo();
+}
+/* { dg-final { scan-tree-dump "magic1" "optimized"} } */
+/* { dg-final { scan-tree-dump "magic3" "optimized"} } */
+/* { dg-final { scan-tree-dump "magic2" "optimized"} } */
+/* { dg-final { scan-tree-dump "foo" "optimized"} } */
+ 
-- 
2.19.1

