Re: [PATCH v4 bpf-next 09/14] bpf: Allow reuse from waiting_for_gp_ttrace list.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 7, 2023 at 1:47 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Fri, Jul 07, 2023 at 09:11:22AM -0700, Alexei Starovoitov wrote:
> > On Thu, Jul 6, 2023 at 9:37 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > On 7/7/2023 12:16 PM, Alexei Starovoitov wrote:
> > > > On Thu, Jul 6, 2023 at 8:39 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> > > >> Hi,
> > > >>
> > > >> On 7/7/2023 10:12 AM, Alexei Starovoitov wrote:
> > > >>> On Thu, Jul 6, 2023 at 7:07 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> > > >>>> Hi,
> > > >>>>
> > > >>>> On 7/6/2023 11:34 AM, Alexei Starovoitov wrote:
> > > >>>>
> > > SNIP
> > > >>> and it's not just waiting_for_gp_ttrace. free_by_rcu_ttrace is similar.
> > > >> I think free_by_rcu_ttrace is different, because the reuse is only
> > > >> possible after one tasks trace RCU grace period as shown below, and the
> > > >> concurrent llist_del_first() must have been completed when the head is
> > > >> reused and re-added into free_by_rcu_ttrace again.
> > > >>
> > > >> // c0->free_by_rcu_ttrace
> > > >> A -> B -> C -> nil
> > > >>
> > > >> P1:
> > > >> alloc_bulk()
> > > >>     llist_del_first(&c->free_by_rcu_ttrace)
> > > >>         entry = A
> > > >>         next = B
> > > >>
> > > >> P2:
> > > >> do_call_rcu_ttrace()
> > > >>     // c->free_by_rcu_ttrace->first = NULL
> > > >>     llist_del_all(&c->free_by_rcu_ttrace)
> > > >>         move to c->waiting_for_gp_ttrace
> > > >>
> > > >> P1:
> > > >> llist_del_first()
> > > >>     return NULL
> > > >>
> > > >> // A is only reusable after one task trace RCU grace
> > > >> // llist_del_first() must have been completed
> > > > "must have been completed" ?
> > > >
> > > > I guess you're assuming that alloc_bulk() from irq_work
> > > > is running within rcu_tasks_trace critical section,
> > > > so __free_rcu_tasks_trace() callback will execute after
> > > > irq work completed?
> > > > I don't think that's the case.
> > >
> > > Yes. The following is my original thoughts. Correct me if I was wrong:
> > >
> > > 1. llist_del_first() must be running concurrently with llist_del_all().
> > > If llist_del_first() runs after llist_del_all(), it will return NULL
> > > directly.
> > > 2. call_rcu_tasks_trace() must happen after llist_del_all(), else the
> > > elements in free_by_rcu_ttrace will not be freed back to slab.
> > > 3. call_rcu_tasks_trace() will wait for one tasks trace RCU grace period
> > > to call __free_rcu_tasks_trace()
> > > 4. llist_del_first() in running in an context with irq-disabled, so the
> > > tasks trace RCU grace period will wait for the end of llist_del_first()
> > >
> > > It seems you thought step 4) is not true, right ?
> >
> > Yes. I think so. For two reasons:
> >
> > 1.
> > I believe irq disabled region isn't considered equivalent
> > to rcu_read_lock_trace() region.
> >
> > Paul,
> > could you clarify ?
>
> You are correct, Alexei.  Unlike vanilla RCU, RCU Tasks Trace does not
> count irq-disabled regions of code as readers.
>
> But why not just put an rcu_read_lock_trace() and a matching
> rcu_read_unlock_trace() within that irq-disabled region of code?
>
> For completeness, if it were not for CONFIG_TASKS_TRACE_RCU_READ_MB,
> Hou Tao would be correct from a strict current-implementation
> viewpoint.  The reason is that, given the current implementation in
> CONFIG_TASKS_TRACE_RCU_READ_MB=n kernels, a task must either block or
> take an IPI in order for the grace-period machinery to realize that this
> task is done with all prior readers.
>
> However, we need to account for the possibility of IPI-free
> implementations, for example, if the real-time guys decide to start
> making heavy use of BPF sleepable programs.  They would then insist on
> getting rid of those IPIs for CONFIG_PREEMPT_RT=y kernels.  At which
> point, irq-disabled regions of code will absolutely not act as
> RCU tasks trace readers.
>
> Again, why not just put an rcu_read_lock_trace() and a matching
> rcu_read_unlock_trace() within that irq-disabled region of code?

If I remember correctly, the general guidance is to always put an
explicit marker if it is in an RCU-reader, instead of relying on
implementation details. So the suggestion to put the marker instead of
relying on IRQ disabling does align with that.

Thanks.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux