On Thu, Oct 10, 2019 at 06:06:14PM +0000, Alexei Starovoitov wrote: > On 10/10/19 10:46 AM, Peter Zijlstra wrote: > > All of stack_map_get_build_id_offset() is just disguisting games; I did > > tell you guys how to do lockless vma lookups a few years ago -- and yes, > > that is invasive core mm surgery. But this is just disguisting hacks for > > not wanting to do it right. > > you mean speculative page fault stuff? > That was my hope as well and I offered Laurent all the help to land it. > Yet after a year since we've talked the patches are not any closer > to landing. > Any other 'invasive mm surgery' you have in mind? Indeed that series. It had RCU managed VMAs and lockless VMA lookups, which is exactly what you need here. > > Basically the only semi-sane thing to do with that trainwreck is > > s/in_nmi()/true/ and pray. > > > > On top of that I just hate buildids in general. > > Emotions aside... build_id is useful and used in production. > It's used widely because it solves real problems. AFAIU it solves the problem of you not knowing what version of the binary runs where; which I was hoping your cloud infrastructure thing would actually know already. Anyway, I know what it does, I just don't nessecarily agree it is the right way around that particular problem (also, the way I'm personally affected is that perf-record is dead slow by default due to built-id post processing). And it obviously leads to horrible hacks like the code currently under discussion :/ > This dead lock is from real servers and not from some sanitizer wannabe. If you enable CFS bandwidth control and run this function on the trace_hrtimer_start() tracepoint, you should be able to trigger a real AB-BA lockup. > Hence we need to fix it as cleanly as possible and quickly. > s/in_nmi/true/ is certainly an option. That is the best option; because tracepoints / perf-overflow handlers really should not be taking any locks. > I'm worried about overhead of doing irq_work_queue() all the time. > But I'm not familiar with mechanism enough to justify the concerns. > Would it make sense to do s/in_nmi/irgs_disabled/ instead? irqs_disabled() should work in this particular case because rq->lock (and therefore all it's nested locks) are IRQ-safe.