On Fri, Jan 12, 2018 at 11:10 PM, Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jan 12, 2018 at 10:45:31PM +0100, Arnd Bergmann wrote:
>> > I guess you could enable the _x routines whenever you use ubsan? Ubsan
>> > will cause much bigger code growth than the handful of insns in those
>> > routines?
>>
>> Right, that could work, too. My patch that Herbert merged intentionally
>> used -Os also for non-UBSAN builds because it turned out to
>> be much faster (see gcc PR83651),
>
> "Much"?
>
> -Os is *slower* with 8.0, 5% faster with 7.2, 4% faster with 7.1,
> slower with 7.0 and 6.3. Your numbers, #c1.
>
> And this is the generic code of course, which is slow anyway (not to
> mention insecure).

Right. I've done some more investigation anyway, starting over with the
analysis of which gcc options change the result. I've now found that
turning off '-fcode-hoisting' while leaving on the other options I had
suspected earlier (-O2 instead of -Os, -ftree-sra, -ftree-pre) also fixes
the stack problem, and appears to result in the best performance so far.

I still need to rerun the whole test matrix, but this looks rather
promising, and the result may also help debug what's really happening.

      Arnd
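
For reference, the option combinations discussed above can be enumerated
with a small script. This is only a sketch: the flags are the gcc options
named in this mail, but the helper name flag_matrix() and the choice of a
full cross product are assumptions for illustration, not the actual test
setup.

```python
from itertools import product

# Optimization levels and optimizer passes mentioned in the thread.
BASE_LEVELS = ["-O2", "-Os"]
TOGGLES = ["code-hoisting", "tree-sra", "tree-pre"]


def flag_matrix():
    """Return every combination of base level and pass on/off flags."""
    rows = []
    on_off = [(True, False)] * len(TOGGLES)
    for level, *states in product(BASE_LEVELS, *on_off):
        flags = [level]
        for name, enabled in zip(TOGGLES, states):
            # gcc spells the "off" form of -fNAME as -fno-NAME
            flags.append(f"-f{name}" if enabled else f"-fno-{name}")
        rows.append(flags)
    return rows


if __name__ == "__main__":
    for flags in flag_matrix():
        print(" ".join(flags))
```

Each printed line would then be appended to the compiler flags for the
file under test, with stack usage and throughput recorded for each build.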