On Wed, Nov 02, 2022 at 02:02:49PM +0000, Leonid Ravich wrote:
> > > > Before I start throwing patches into the air, I would like to
> > > > align with you on the approach we should take here.
> > > >
> > > > My suggestion:
> > > > - The ftrace infra should verify that no migration happens (end
> > > >   and start happen on the same CPU); if it does, we throw a
> > > >   warning for the issue.
> > >
> > > The scheduler should have. On entering the ring buffer code
> > > ring_buffer_lock_reserve() it disables preemption and does not
> > > re-enable it until ring_buffer_unlock_commit().
> > >
> > > The only way to migrate is if you re-enable preemption. WHICH IS A
> > > BUG!
> > > So what on earth did that?
> >
> > I'm guessing some driver's query_pkey op, but AFAIK we don't have any
> > explicit pre-emption reenablements in the code - unless it is sneaky..
>
> The trace infra uses preempt_disable_notrace/preempt_enable_notrace to
> disable/enable preemption, but my kernel is compiled without
> CONFIG_PREEMPTION, so these functions are only barriers. It looks like
> the idea behind them was to avoid involuntary preemption, but in our
> case it is voluntary (there is a wait_for_completion in the
> query_pkey rabbit hole).

So this tracepoint is just wrong, you can't call a sleepable function
from a tracepoint like that?

Presumably lockdep would/should warn about this?

Delete the pkey logging from the tracepoint, it can't work, I guess.

Jason
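
For illustration only, here is a rough user-space sketch of the same-CPU check suggested in the quoted text: remember the CPU at reserve time and compare it at commit time. This is not ring buffer code. Every name in it (fake_cpu, sketch_event, sketch_reserve, sketch_commit, sketch_sleep_and_migrate) is made up for the example, and the "migration" is simulated by changing a variable. In the kernel I'd assume the CPU id would come from something like raw_smp_processor_id() and a mismatch would be reported with something like WARN_ON_ONCE(), but I have not checked where such a check would fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "the CPU we are currently running on". */
static int fake_cpu;

/* Hypothetical per-event bookkeeping: which CPU reserved the slot. */
struct sketch_event {
    int reserve_cpu;
    bool reserved;
};

static int sketch_current_cpu(void)
{
    return fake_cpu;
}

/* Models the reserve step: record the CPU the event started on. */
static void sketch_reserve(struct sketch_event *ev)
{
    assert(ev != NULL);
    ev->reserve_cpu = sketch_current_cpu();
    ev->reserved = true;
}

/*
 * Models the commit step: returns true if we are still on the CPU that
 * did the reserve, false if a migration happened in between (the case
 * where the real code would want to warn).
 */
static bool sketch_commit(struct sketch_event *ev)
{
    assert(ev != NULL && ev->reserved);
    bool same_cpu = (ev->reserve_cpu == sketch_current_cpu());
    ev->reserved = false;
    return same_cpu;
}

/*
 * Models a sleepable call made between reserve and commit: after a
 * voluntary sleep the task may resume on a different CPU.
 */
static void sketch_sleep_and_migrate(int new_cpu)
{
    fake_cpu = new_cpu;
}
```

A reserve/commit pair with nothing sleepable in between reports the same CPU. Calling sketch_sleep_and_migrate() between the two makes sketch_commit() return false, which is the situation described above for query_pkey.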