Re: 'perf test sigtrap' failing on PREEMPT_RT_FULL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 06, 2024 at 10:06:30AM -0300, Arnaldo Carvalho de Melo wrote:
> > In Thu, 4 Jan 2024 19:35:57 -0300, Arnaldo Carvalho de Melo wrote:
> > > +++ b/kernel/events/core.c
> > > @@ -6801,7 +6801,7 @@ static void perf_pending_task(struct callback_head *head)
> > >         * If we 'fail' here, that's OK, it means recursion is already disabled
> > >         * and we won't recurse 'further'.
> > >         */
> > >-       preempt_disable_notrace();
> > >+       migrate_disable();
> > >        rctx = perf_swevent_get_recursion_context();
>  
> > Pardon my ignorance, is it safe to call preempt_count() with preemption
> > enabled on PREEMPT_RT, or at least in the context being discussed here?
>  
> > Because:
>  
> > 	 perf_swevent_get_recursion_context()
> > 	     get_recursion_context()
> >                  interrupt_context_level()
> >                      preempt_count()	
>  
> > And:
>  
> > int perf_swevent_get_recursion_context(void)
> > {
> >         struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
> > 
> >         return get_recursion_context(swhash->recursion);
> > }
> 
> Seems to be enough because perf_pending_task is a irq_work callback and

s/irq_work/task_work/ but that also doesn't reentry, I think

> that is guaranteed not to reentry?
> 
> Artem's tests with a RHEL kernel seems to indicate that, ditto for my,
> will test with upstream linux-6.8.y-rt.
> 
> But there is a lot more happening in perf_sigtrap and I'm not sure if
> the irq_work callback gets preempted we would not race with something
> else.
> 
> Marco, Mike, ideas?

Looking at:

commit ca6c21327c6af02b7eec31ce4b9a740a18c6c13f
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date:   Thu Oct 6 15:00:39 2022 +0200

    perf: Fix missing SIGTRAPs
    
    Marco reported:
    
    Due to the implementation of how SIGTRAP are delivered if
    perf_event_attr::sigtrap is set, we've noticed 3 issues:
    
      1. Missing SIGTRAP due to a race with event_sched_out() (more
         details below).
    
      2. Hardware PMU events being disabled due to returning 1 from
         perf_event_overflow(). The only way to re-enable the event is
         for user space to first "properly" disable the event and then
         re-enable it.
    
      3. The inability to automatically disable an event after a
         specified number of overflows via PERF_EVENT_IOC_REFRESH.
    
    The worst of the 3 issues is problem (1), which occurs when a
    pending_disable is "consumed" by a racing event_sched_out(), observed
    as follows:

-------------------------------------------------------------

That its what introduces perf_pending_task(), I'm now unsure we can just
disable migration, as event_sched_out() seems to require being called
under a raw_spin_lock and that disables preemption...

- Arnaldo




[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux