Re: ptrlist-iterator performance on one wine source file

On Thu, Aug 03, 2017 at 12:50:20AM +0200, Luc Van Oostenryck wrote:
> On Wed, Aug 2, 2017 at 3:17 AM, Christopher Li <sparse@xxxxxxxxxxx> wrote:
> > On Tue, Aug 1, 2017 at 5:46 PM, Luc Van Oostenryck
> > <luc.vanoostenryck@xxxxxxxxx> wrote:
> >> has any effects for sparse and the only effect of -O2 is
> >> to define __OPTIMIZE__.
> >
> > Yes, indeed. It is the -O2 that makes the difference.
> >
> > I recently upgraded to Fedora 26. I think it is the system header files
> > making a difference
> > on __OPTIMIZE__. I attach two files here. O2.c is the one with the -O2 flag
> > after preprocessing. The 0.c is the one without.
> >
> > There is a huge difference between them.
> >
> > I confirm that with O2.c I am seeing the 24 second delay, and with 0.c 4 seconds.
> >
> > I attach the two files with gzip.
> 
> It seems that the email was rejected on the mailing list.
> 
> > I think you should be able to reproduce it with O2.c now.
> 
> Yes, I can reproduce it now.
> The differences between the two files are not big. Basically, with -O2 there is:
> - a bunch of small functions which now have an inline definition
> - a set of strcmp(winetest_platform, "wine") calls which are replaced
>   by a macro from hell.
> It's, of course, the last one which creates the problem.
> The macro seems to try to optimize the comparison using the fact
> that the compiler will statically evaluate things like:
> - strlen("wine")
> - "wine"[0], "wine"[1], ...
> but sparse doesn't do this kind of simplification (yet) and this results
> in much, much more code.
> 
> A first inspection of the generated code doesn't show anything
> obviously wrong but I can't exclude that there is another problem.

I looked a bit more at this and the problem is not really caused by the
static evaluation that sparse doesn't do. Note that I'm working on a simplified
version of the files, keeping nothing after test_ScriptItemize().

1) some numbers:
- GCC compiles both preprocessed files in 0.9s with -O2.
- sparse checks the O0 file in 1.93s and the O2 file in 13s.
Thus even on the O0 file the time is already too high: sparse is generally
roughly 10 times faster than gcc -O2, but here it is twice as slow.

sparse emits roughly 5 times more BBs for the O2 file than for the O0 one;
it also emits roughly 5 times more instructions and takes roughly 5 times
longer to generate them (OK, 7 times longer).
Thus the processing time and the number of instructions scale well
with the number of BBs emitted (and these BBs are emitted before any
simplifications are made), so there is no sign of any non-linearity
nor of any simplification/optimization running wild.

2) I don't think that the lack of static evaluation of strlen("wine") or
   "wine"[0] is a problem here because, where present, this code is preceded
   by a test __builtin_constant_p(winetest_platform) which fails.
   All this code is then quickly optimized away (but has first been
   generated as linearized code).

3) the situation with the macro from hell is even worse than I thought:
   it's present 7 times, but in an inline function which is itself
   used 292 times. Thus this code is present 7 * 292 = 2044 times!
   Some tests show that the time is (roughly) proportional to the
   number of times this inline function is used.

4) if we replace 'inline' by 'inline __attribute__((always_inline))'
   GCC needs roughly 58s to compile the O0 or O2 file.
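
To make point 2 a bit more concrete, here is a minimal sketch of the kind
of guard involved. It is not the real glibc macro: SKETCH_STRCMP,
winetest_platform_sketch, running_under_wine() and sketch_self_check() are
hypothetical names invented for illustration, and the "fast path" is reduced
to a single first-character check. It assumes GCC or clang for
__builtin_constant_p.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical, much-simplified stand-in for an optimizing strcmp macro.
 * When both arguments are compile-time constants, a first-character
 * shortcut is tried; otherwise the plain strcmp() call is used.
 * The shortcut expression is still part of the source text in both
 * cases, so a front end has to process it before it can be discarded.
 */
#define SKETCH_STRCMP(a, b)                                         \
    (__builtin_constant_p(a) && __builtin_constant_p(b)             \
        ? ((a)[0] != (b)[0] ? (a)[0] - (b)[0] : strcmp((a), (b)))   \
        : strcmp((a), (b)))

/* A run-time variable: __builtin_constant_p() is not expected to hold. */
const char *winetest_platform_sketch = "windows";

/* Returns 1 when the platform string is "wine", 0 otherwise. */
int running_under_wine(void)
{
    return SKETCH_STRCMP(winetest_platform_sketch, "wine") == 0;
}

/* Small self-check: the result agrees with a plain string comparison. */
void sketch_self_check(void)
{
    winetest_platform_sketch = "windows";
    assert(!running_under_wine());
    winetest_platform_sketch = "wine";
    assert(running_under_wine());
}
```

Every use of such a macro drags the whole conditional expression along,
even though only the final strcmp() call survives once the guard is known
to be false.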

My conclusion is that most of the problem here comes from the fact that:
   - sparse always inlines functions marked as inline
   - and does so very early, before any optimizations (so the extra macro
     code is inlined 2044 times and needs to be processed 2044 times).
GCC needs the same time for both files because it inlines functions after
a first optimization pass (I'm guessing here).
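
For anyone who wants to repeat the experiment from point 4 on a small
scale, the change looks roughly like the sketch below. The names
FORCE_INLINE_SKETCH, is_wine_inlined(), count_wine_platforms() and
inline_self_check() are made up for this example and are not taken from
the wine sources; the attribute itself is the GCC/clang one mentioned above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper macro: force inlining at every call site. */
#define FORCE_INLINE_SKETCH inline __attribute__((always_inline))

/*
 * With always_inline, the body below is copied into each caller,
 * so whatever its strcmp() use expands to is duplicated per call site.
 */
static FORCE_INLINE_SKETCH int is_wine_inlined(const char *platform)
{
    return strcmp(platform, "wine") == 0;
}

/* Two call sites, hence two copies of the inlined body. */
int count_wine_platforms(const char *a, const char *b)
{
    return is_wine_inlined(a) + is_wine_inlined(b);
}

/* Small self-check of the behaviour (not of the inlining itself). */
void inline_self_check(void)
{
    assert(count_wine_platforms("wine", "windows") == 1);
    assert(count_wine_platforms("wine", "wine") == 2);
}
```

With 292 call sites instead of 2, and 7 macro uses in the body instead of
1, this gives the 2044 copies counted above.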

-- Luc