Re: [PATCH] bpf: fix rq lock recursion issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 23, 2022 at 11:56 PM Satya Durga Srinivasu Prabhala
<quic_satyap@xxxxxxxxxxx> wrote:
>
>
> On 6/13/22 11:09 PM, Yonghong Song wrote:
> >
> >
> > On 6/13/22 6:10 PM, Satya Durga Srinivasu Prabhala wrote:
> >>
> >> On 6/13/22 2:49 PM, Alexei Starovoitov wrote:
> >>> On Mon, Jun 13, 2022 at 2:35 PM Satya Durga Srinivasu Prabhala
> >>> <quic_satyap@xxxxxxxxxxx> wrote:
> >>>>
> >>>> On 6/13/22 2:01 PM, Alexei Starovoitov wrote:
> >>>>> is doesn't solve anything.
> >>>>> Please provide a reproducer.
> >>>> I'm trying to find an easy way to repro the issue, so far,
> >>>> unsuccessful.
> >>>>
> >>>>> iirc the task's affinity change can race even with preemption
> >>>>> disabled
> >>>>> on this cpu. Why would s/migrate/preemption/ address the deadlock ?
> >>>> I don't think task's affinity change races with preemption
> >>>> disabled/enabled.
> >>>>
> >>>> Switching to preemption disable/enable calls helps as it's just simple
> >>>> counter increment and decrement with a barrier, but with migrate
> >>>> disable/enable when task's affinity changes, we run into recursive bug
> >>>> due to rq lock.
> >>> As Yonghong already explained, replacing migrate_disable
> >>> with preempt_disable around bpf prog invocation is not an option.
> >>
> >> If I understand correctly, Yonghong mentioned that replacing migrate_
> >> with preempt_ won't work for RT Kernels and migrate_ APIs were
> >> introduced
> >> for RT Kernels is what he was pointing to. I went back and cross checked
> >> on 5.10 LTS Kernel, I see that the migrate_ calls end up just calling
> >> into
> >> preemt_ calls [1]. So far non-RT kernels, sticking to preemt_ calls
> >> should
> >> still work. Please let me know if I missed anything.
> >
> > Yes, old kernel migrate_disable/enable() implementation with
> > simply preempt_disable/enable() are transitional. You can check
> > 5.12 kernel migrate_disable/enable() implementation. Note that
> > your patch, if accepted, will apply to the latest kernel. So we
> > cannot simply replace migrate_disable() with prempt_disable(),
> > which won't work for RT kernel.
>
> Thanks for getting back and apologies for the delay. I understand that
> we may break RT kernels with this patch. So, I was proposing to stick to
> migrate_disable/enable() calls on RT Kernels and use preempt_disable/enable()
> in case of non-RT Kernel. Which warrants change in scheduler, will push
> patch and try get feedback from Scheduler experts.

We will stay with migrate_disable/enable on all types of kernel.
This is not negotiable.

> While I'm here I would like to cross check with you xperts on ideas to
> reproduce the issue easily and consistently. Your inputs will immensely help to
> debug issue further.

Please do. No patches will be applied until there is a clear reproducer.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux