Re: Regular gcc benchmark runs for sparse-matrix vector multiplication?

Harald Anlauf <anlauf@xxxxxx> · Mon, 17 Dec 2018 21:50:52 +0100

I've created

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88533

which provides a testcase and some performance data.

On 12/17/18 11:16, Richard Biener wrote:
> GCC 9 IL looks saner than the GCC 7/8 one.  Note both compilers
> have bound checks inside the innermost loop.  The main difference
> seems to be in loop header copying where GCC 9 is behaving
> much "better" IMHO.  It would be interesting to see whether
> -fno-tree-ch brings results of the compilers in line again (even
> if it causes the code to run even slower).

I tried -fno-tree-ch as suggested, and it does bring the results of
GCC 7, 8 and 9 in line again.  However, it does not appear to be the
most attractive option.

I'd like to emphasize that -funroll-loops is generally a good option here.
(The resulting code still doesn't come close to that of the Intel or PGI
compilers, but that's a different story.)
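
For readers who don't have the testcase from the PR at hand, here is a
minimal sketch of the kind of kernel being discussed.  It is not the
testcase attached to the bug report: it is written in C, the CSR storage
layout and all names (spmv_csr, row_ptr, col_idx, val) are assumptions
chosen for illustration, and the assert() merely stands in for a bounds
check inside the innermost loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR sparse-matrix times vector kernel: y = A * x.
   row_ptr has nrows + 1 entries; col_idx and val each have
   row_ptr[nrows] entries.  Illustrative only, not the PR testcase. */
void spmv_csr(size_t nrows, size_t ncols,
              const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < nrows; i++) {
        double sum = 0.0;
        /* Innermost loop: short trip count, indirect load of x. */
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
            size_t j = col_idx[k];
            /* Explicit check standing in for a bounds check;
               it disappears if compiled with -DNDEBUG. */
            assert(j < ncols);
            sum += val[k] * x[j];
        }
        y[i] = sum;
    }
}
```

Building such a kernel once with and once without -funroll-loops (or
-fno-tree-ch) at the same optimization level is one way to compare what
the different compiler versions generate for the inner loop.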

Thanks,
Harald