Re: [PATCH 4.19,5.4,5.10,5.15] perf/core: Fix reentry problem in perf_output_read_group()

On Thu, Mar 07, 2024 at 02:50:15PM -0300, Thadeu Lima de Souza Cascardo wrote:
> From: Yang Jihong <yangjihong1@xxxxxxxxxx>
> 
> commit 6b959ba22d34ca793ffdb15b5715457c78e38b1a upstream.
> 
> perf_output_read_group may respond to an IPI request from another core and
> invoke the __perf_install_in_context function. As a result, the hwc
> configuration is modified, causing inconsistency and unexpected consequences.
> 
> Interrupts are not disabled when perf_output_read_group reads the PMU
> counters, so an IPI request may be received from another core in the middle
> of the read. The PMU configuration is then modified and an error occurs when
> reading the PMU counter:
> 
> 		     CPU0                                         CPU1
> 						      __se_sys_perf_event_open
> 							perf_install_in_context
>   perf_output_read_group                                  smp_call_function_single
>     for_each_sibling_event(sub, leader) {                   generic_exec_single
>       if ((sub != event) &&                                   remote_function
> 	  (sub->state == PERF_EVENT_STATE_ACTIVE))                    |
>   <enter IPI handler: __perf_install_in_context>   <----RAISE IPI-----+
>   __perf_install_in_context
>     ctx_resched
>       event_sched_out
> 	armpmu_del
> 	  ...
> 	  hwc->idx = -1; // event->hw.idx is set to -1
>   ...
>   <exit IPI>
> 	      sub->pmu->read(sub);
> 		armpmu_read
> 		  armv8pmu_read_counter
> 		    armv8pmu_read_hw_counter
> 		      int idx = event->hw.idx; // idx = -1
> 		      u64 val = armv8pmu_read_evcntr(idx);
> 			u32 counter = ARMV8_IDX_TO_COUNTER(idx); // invalid counter = 30
> 			read_pmevcntrn(counter) // undefined instruction
> 
> Signed-off-by: Yang Jihong <yangjihong1@xxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Link: https://lkml.kernel.org/r/20220902082918.179248-1-yangjihong1@xxxxxxxxxx
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxx>
> ---
> 
> This race may also lead to observed behavior such as RCU stalls, hung tasks,
> and OOM, likely due to list corruption or a similar root cause.

Now queued up, thanks.

greg k-h



