On Tue, 18 Oct 2016, Luis R. Rodriguez wrote: > On Tue, Oct 18, 2016 at 10:08:44AM +0200, Vegard Nossum wrote: > > > > Vegard, thanks for bringing this to attention! > > Including this hunk for those that were originally not CC'd > on the original patch. > > > diff --git a/include/linux/extarray.h b/include/linux/extarray.h > > new file mode 100644 > > index 0000000..1185abc > > --- /dev/null > > +++ b/include/linux/extarray.h > > @@ -0,0 +1,65 @@ > > +#ifndef LINUX_EXTARRAY_H > > +#define LINUX_EXTARRAY_H > > + > > +#include <linux/compiler.h> > > + > > +/* > > + * A common pattern in the kernel > > This is quite an understatement, as you noted on the cover letter, I've been > reviewing these uses, in particular where such uses are used also for the > linker script for quite some time now. I've devised a generic API for these > uses then to help with making it easier to add new entries, making easier > to avoid typos, and giving us some new features from this effort. This the > linker table and section range APIs [0] [1]. Given my review of all this code in > the kernel I'd say this use is not just common, its a well established practice > by now, across *all* architectures. > > In fact I would not be surprised if this usage and practice has extended far > beyond the kernel by now, and custom firmwares / linker scripts also use this > to mimic the practice on the kernel to take advantage of this old hack to stuff > special code / data structures into special ELF sections. As such, I would not > think this is just an issue for Linux. > > Upon a quick cursory review I can confirm such use is prevalent in Xen as well, > for instance the Xen Security Module framework uses it since eons ago [2]. Its > used elsewhere on the Xen linker script too though... so XSM is one small > example at a *quick* glance. > > [0] https://lkml.kernel.org/r/1471642875-5957-1-git-send-email-mcgrof@xxxxxxxxxx > [1] https://lkml.kernel.org/r/1471642454-5679-1-git-send-email-mcgrof@xxxxxxxxxx > [2] https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=d046f361dc937d8fc179cc2da168f571726cb5a0 > > > + is to put certain objects in a specific > > + * named section and then create variables in the linker script pointing > > + * to the start and the end of this section. These variables are declared > > + * as extern arrays to allow C code to iterate over the list of objects > > Right, sometimes its just a char pointer to represent code, other times it > custom data structure. > > > + * > > + * In C, comparing pointers to objects in two different arrays is undefined > > + * GCC version 7.0 and newer (commit 73447cc5d17) will aggressively optimize > > + * out such comparisons if it can prove that the two pointers point to > > + * different arrays (which is the case when the arrays are declared as two > > + * separate variables). This breaks the typical code used to iterate over > > + * such arrays. > > Eek, thanks, I checked commit 73447cc5d17 on gcc [2] however its commit log > is pretty vague to me and does not seem to justify why exactly this optimization > is done, nor what effort was put in to vet for it, as such I cannot agree or > disagree with it. Logically though, to me these are just pointers, as such, > its not clear to me why gcc can take such optimization to make such assumptions. > > Since I cannot infer any more from this commit, and given how prevalent I > expect this use to be (clearly even outside of Linux) I am inclined to consider > this a gcc bug, which requires at least an opt-in optimization rather than this > being a default. Had anyone reported this ?? > > Richard ? The commit implements a long-standing failure to optimize trivial pointer comparisons that arise for example from libstdc++. PR65686 contains a simple C example: mytype f(struct S *e) { mytype x; if(&x != e->pu) __builtin_memcpy(&x, e->pu, sizeof(unsigned)); return x; } where GCC before the commit could not optimize the &x != e->pu test as trivial false. The commit uses points-to analysis results to simplify pointer equality tests (which is in itself a very good way to expose bugs in points-to analysis -- I've fixed a bunch of them thanks to this ;)). > [2] https://github.com/gcc-mirror/gcc/commit/73447cc5d17 > > > + * > > + * One way to get around this limitation is to force GCC to lose any array > > + * information about the pointers before we compare them. We can use e.g. > > + * OPTIMIZER_HIDE_VAR() for this. > > As far as Linux is concerned though your patch set addresses covering a few > cases, it does not cover all, so while it might help boot x86 on some type of > system, by no means would I expect it suffice to boot all Linux systems. > Additionally, while it may *fix* boot on x86, are we certain the use of > OPTIMIZER_HIDE_VAR() may not *break* on certain platforms ? I would ask such > type of intrusive changes which affect the linker script to go well beyond > just 0-day for testing -- Guenter Roeck was kind enough to let me test my > series for linker table / section ranges on his infrastructure which covers > much architectures and toolchains not addressed by 0-day. I'd expect such type > of testing for these types of changes, which can affect many architectures. > > Additionally you asked for this to series to be considered a stable > patch, if this just fixes x86 boot, but breaks other architecture it would > be a considerable regression. I'd prefer we first determine if we want this > change reverted on gcc, or if it at least can be disabled by default. > I really do expect shit to hit the fan elsewhere, so this work around > doesn't same like the next best step at this point. > > > + * > > + * This file defines a few helpers to deal with declaring and accessing > > + * such linker-script-defined arrays. > > + */ > > + > > + > > +#define DECLARE_EXTARRAY(type, name) \ > > + extern type __start_##name[]; \ > > + extern type __stop_##name[]; \ > > + > > +#define _ext_start(name, tmp) \ > > + ({ \ > > + typeof(*__start_##name) *tmp = __start_##name; \ > > + OPTIMIZER_HIDE_VAR(tmp); \ > > + tmp; \ > > + }) > > + > > +#define ext_start(name) _ext_start(name, __UNIQUE_ID(ext_start_)) > > + > > +#define _ext_end(name, tmp) \ > > + ({ \ > > + typeof(*__stop_##name) *tmp = __stop_##name; \ > > + OPTIMIZER_HIDE_VAR(tmp); \ > > + tmp; \ > > + }) > > + > > +#define ext_end(name) _ext_end(name, __UNIQUE_ID(ext_end_)) > > + > > +#define _ext_size(name, tmp1, tmp2) \ > > + ({ \ > > + typeof(*__start_##name) *tmp1 = __start_##name; \ > > + typeof(*__stop_##name) *tmp2 = __stop_##name; \ > > + OPTIMIZER_HIDE_VAR(tmp1); \ > > + OPTIMIZER_HIDE_VAR(tmp2); \ > > + tmp2 - tmp1; \ > > + }) > > + > > +#define ext_size(name) \ > > + _ext_size(name, __UNIQUE_ID(ext_size1_), __UNIQUE_ID(ext_size2_)) > > + > > +#define ext_for_each(var, name) \ > > + for (var = ext_start(name); var != ext_end(name); var++) > > + > > +#endif > > FWIW, with linker able we'd do this type of "fix" in one place later, > if we wanted it, provided all uses are ported, of course. > > > On 10/17/2016 01:45 PM, Peter Zijlstra wrote: > > > On Mon, Oct 17, 2016 at 01:27:08PM +0200, Vegard Nossum wrote: > > > > On 10/17/2016 11:09 AM, Peter Zijlstra wrote: > > > > > On Mon, Oct 17, 2016 at 11:01:13AM +0200, Jiri Slaby wrote: > > > > > > On the top of that, it's incorrect C according to the standard. > > > > > > > > > > According to the standard non of the kernel has any chance in hell of > > > > > working, so don't pretend you care about that :-) > > > > > > > > I think that's a bit of a false dilemma. It's obviously true that kernel > > > > code does not conform to the standards, but that doesn't mean it's not > > > > something we should strive towards or care about in general. It helps > > > > static analysis tools, compiler diversity, etc. > > > > > > Sure, but this, two separately allocated objects their address should > > > not be compared and therefore... stuff is explicitly relied upon by the > > > kernel in many places. > > > > > > We have workarounds in various places, and this patch adds yet another > > > instance of it. > > > > > > The workaround is simply confusing the compiler enough to have it not do > > > the 'optimization'. But we very much still rely on this 'undefined' > > > behaviour. > > > > > > I think it makes more sense to explicitly allow it than to obfuscate our > > > code and run the risk a future compiler will see through our tricks. > > > > Actually, I think we're all a bit wrong. > > > > It's not comparing the pointers that's undefined behavior, that was my > > bad trying to oversimplify the issue. > > > > Of course, comparing arbitrary (valid) pointers with each other is not > > undefined behavior. The undefined behavior is technically doing iter++ > > past the end of the array, > > What defines the end of the array? > > > that is "creating" a pointer that points outside an array. > > I mean, its just a pointer. > > This is the sort of information that would be useful for the commit log. > > > What gcc does wrong is that it sees us iterating over one array and the > > optimizer concludes that the iterator can never point to the second > > array. > > Which second array? Why does it make this huge assumption ? > > > I'd argue there's no real undefined behaviour happening here. > > Thus, the code _is_ correct and valid C as it stands, it just doesn't do > > what you would expect intuitively. > > People have relied on its functionality for years, if this is going > to change it would be I think a good idea to at least have a strong > justification rather than having us trying to interpret the original > goal of the gcc patch. The original goal of the gcc patch wasn't to break the kernel. The goal was to implement an optimization other compilers do since a long time. > > However, from the linker script's point of view there is no difference > > between one big array and two consecutive arrays of the same type, and > > the fact that the compiler doesn't know the memory layout -- although > > we do. > > In Linux we don't mix/match pointer types, we typically use two char * > pointers for start/end of code, or a data structure pointers for start/ > end of list, that's it. Its not clear to me why gcc believes it is correct > to assume start != end. > > > When we call OPTIMIZER_HIDE_VAR() we're not really "confusing" the > > compiler, it's more like calling a function defined in a different file > > (therefore the compiler can't see into it) that returns a pointer which > > we _know_ is valid because it still points to (the end of) the array. > > Its a hack to work around the optimization, if we are to do this I think > its important we all first understand *why* the original commit was done, > without this - it would seem the current optimization will just go breaking > quite a bit of projects. Your changeset would also require much more work > (or we port things to the linker table / section ranges API, and just do the > fix with those wrappers, as it would mean 1 collateral evolution rather than 2 > -- one for this fix and then one for the linker table API). > > > If we obtain a pointer somehow (be it from a hardware register or from > > the stack of a userspace process), the compiler must assume that it's a > > valid pointer, right? The only reason it didn't do that for the arrays > > was because we had declared the two arrays as separate variables so it > > "knew" it was the case that the iterator initialised to point to one > > array could never point to the second array. > > But why is this invalid C all of a sudden with optimizations ? > > foo.h: > > extern char *start_foo; > extern char *end_foo; > > foo.c: > > #include "foo.h" > > char *str = "hello"; > > char *start_foo; > char *end_foo; > > int main(void) > { > start_foo = str; > end_foo = str; > > return !(start_foo == end_foo); > } Why should this be invalid? Let's look at what is special about the kernel usage. Looking at GCC bug 77964 it is declaring extern struct builtin_fw __start_builtin_fw[]; extern struct builtin_fw __end_builtin_fw[]; which are extern zero-sized arrays. I suppose they are nowhere actually defined but these symbols are created by the linker script only. I can think of adding workarounds to GCC to try catching this special case which would be treating a pointer to a extern object of size zero (so you can't do this in C++ ...) as a pointer to possibly any other global variable (given actual data layout may make the pointer value equal to it). Index: gcc/tree-ssa-structalias.c =================================================================== --- gcc/tree-ssa-structalias.c (revision 241327) +++ gcc/tree-ssa-structalias.c (working copy) @@ -2944,6 +2944,17 @@ get_constraint_for_ssa_var (tree t, vec< DECL_PT_UID (t) = DECL_UID (node->decl); t = node->decl; } + + /* If this is an external zero-sized object consider it to + point to NONLOCAL as well. */ + if (DECL_EXTERNAL (t) + && (! DECL_SIZE (t) || integer_zerop (DECL_SIZE (t)))) + { + cexpr.var = nonlocal_id; + cexpr.type = SCALAR; + cexpr.offset = 0; + results->safe_push (cexpr); + } } vi = get_vi_for_tree (t); Note that any issues exposed by the pointer equality simplification are possible issues in previous GCC with regard to alias analysis. I notice we try to honor symbol interposition when directly comparing __start_builtin_fw != __end_builtin_fw for example. But we certainly do not "honor" interposition for alias analysis for, say extern int a[1]; extern int b[1]; where a store to a[0] is not considered aliasing b[0]. Richard. > > Anyway, IANALL. > > IGINYA - I guess I'm not young anymore, what's IANALL mean? :) > > Luis > > -- Richard Biener <rguenther@xxxxxxx> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html