On Sat, Jan 26, 2013 at 7:18 AM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
> On the CPUs Ling is testing on, the downsides of -Os probably matter less, in particular since rep movsb works well.
>
> It is questionable as a generic default, though.

So being the person who really pushed for -Os to begin with (I think I$ and instruction decode bandwidth is one of the most fundamental limits to CPU performance), I wouldn't mind it if we reintroduced it.

HOWEVER. It wasn't just "rep movs". The thing that killed -Os for me was that it makes it impossible to try to optimize hot code, because -Os seems to throw out branch prediction information. So when you use "likely()" etc to try to teach the compiler to lay out code a certain way so that code that never really gets executed isn't even brought into the I$, -Os then screws it up completely.

Of course, maybe newer versions of gcc might not suck so horribly with -Os, I haven't actually tried in a while.

[ Just tested. Still does it ]

Also, I doubt Ling was testing a SB CPU. Because "rep movsb" still sucks pretty bad on SB. What core *is* Ling testing? Haswell?

Ugh. We could make it depend on the optimization target. I also wish there was some way to just tune gcc -Os to be closer to reasonable. Or make -O2 not do some of the excessive crap it does (it aligns code *much* too much, for example - who cares if you can do it with a single instruction, if that instruction is so long that it uses up half your decode bandwidth?)

The problem, of course, is that most -O2 code generation is done assuming hot loops that don't show much if any I$ issues. And the -Os thing is done *purely* for size, not taking any performance into account at all. There's no balanced middle ground, which is what _we_ would want.

               Linus
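
For anyone who wants to look at the likely() layout issue described above, here is a minimal sketch of that kind of hint. It assumes GCC or Clang, since __builtin_expect is a compiler builtin; the macros have the same shape as the kernel's, and sum_bytes() is a made-up function used only for illustration.

```c
#include <stddef.h>

/* Branch hints built on the GCC/Clang builtin __builtin_expect. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical example function: sum the bytes of a buffer.
 * The NULL check is marked unlikely, which asks the compiler to keep
 * the error return off the straight-line (hot) path.
 */
long sum_bytes(const unsigned char *buf, size_t len)
{
    long total = 0;

    if (unlikely(buf == NULL))
        return -1;              /* cold path, hinted as rarely taken */

    for (size_t i = 0; i < len; i++)
        total += buf[i];        /* hot path */

    return total;
}
```

Building this file with "gcc -O2 -S" and again with "gcc -Os -S" and comparing where the cold return ends up is one way to see whether a given compiler version honors the hint. The result depends on the gcc version and target, so treat it as an experiment, not a guaranteed outcome.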