On Tue, Oct 17, 2017 at 05:03:40PM +0200, Thomas Gleixner wrote:
> On Tue, 17 Oct 2017, Ingo Molnar wrote:
> > * Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > > On Tue, 17 Oct 2017, Ingo Molnar wrote:
> > > > No, please fix performance.
> > >
> > > You know very well that with the cross release stuff we have to take the
> > > performance hit of stack unwinding because we have no idea whether there
> > > will show up a new lock relation later or not. And there is not much you
> > > can do in that respect.
> > >
> > > OTOH, the cross release feature unearthed real deadlocks already so it is a
> > > valuable debug feature and having an explicit config switch which defaults
> > > to N is well worth it.
> >
> > I disagree, because even if that's correct, the choices are not binary. The
> > performance regression was a slowdown of around 7x: lockdep boot overhead on that
> > particular system went from +3 seconds to +21 seconds...
>
> Hmm, I might have missed something, but what I've seen in this thread is:
>
> > > > Boot time (from "Linux version" to login prompt) had in fact doubled
> > > > since 4.13 where it took 17 seconds (with my current config) compared to
> > > > the 35 seconds I now see with 4.14-rc4.
>
> So that's 2x not 7x. On one of my main test machines it's about ~1.4 so I
> did not even really notice until this thread came up. Probably I have no
> expectations on boot time and performance when lockdep is on :)
>
> > As a response to the performance regression I haven't seen _any_ attempt to
> > measure, profile and generally quantify the performance impact, which would at
> > least make it more believable that the overhead cannot be reduced. That really
> > makes me worry about the code on a higher level than just whether it can be
> > enabled by default or not.
>
> I did some quick perf top analysis, not in detail though, and what really
> dominates with that feature is the unwinder, which needs to be
> unconditional due to the nature of the problem.
>
> I have not spent a huge amount of time to think about ways to improve that,
> but I could not come up with anything smart so far.
>
> The only thing I thought about was making the unwind short and only record
> one or two call levels (if at all) instead of following the full call

Yes, I think that's the best option I can do. Thank you very much.

> chain. That makes it less useful for a quick test, but once you hit a splat
> you can enable full depth recording for full analysis. In the full analysis
> case performance is the least of your worries.
>
> > Caring about the performance of debug features very much matters, _especially_
> > when they are expensive.
>
> I'm not disagreeing. I'm just trying to understand why this is marked
> BROKEN where I think it should be marked TOO_EXPENSIVE.
>
> Thanks,
>
> 	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html