Hi,

On 7/7/2023 12:16 PM, Alexei Starovoitov wrote:
> On Thu, Jul 6, 2023 at 8:39 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> On 7/7/2023 10:12 AM, Alexei Starovoitov wrote:
>>> On Thu, Jul 6, 2023 at 7:07 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> On 7/6/2023 11:34 AM, Alexei Starovoitov wrote:
>>>> SNIP
>>> and it's not just waiting_for_gp_ttrace. free_by_rcu_ttrace is similar.
>> I think free_by_rcu_ttrace is different, because the reuse is only
>> possible after one tasks trace RCU grace period as shown below, and the
>> concurrent llist_del_first() must have been completed when the head is
>> reused and re-added into free_by_rcu_ttrace again.
>>
>> // c0->free_by_rcu_ttrace
>> A -> B -> C -> nil
>>
>> P1:
>> alloc_bulk()
>>     llist_del_first(&c->free_by_rcu_ttrace)
>>         entry = A
>>         next = B
>>
>> P2:
>> do_call_rcu_ttrace()
>>     // c->free_by_rcu_ttrace->first = NULL
>>     llist_del_all(&c->free_by_rcu_ttrace)
>>         move to c->waiting_for_gp_ttrace
>>
>> P1:
>>     llist_del_first()
>>         return NULL
>>
>> // A is only reusable after one task trace RCU grace
>> // llist_del_first() must have been completed
> "must have been completed" ?
>
> I guess you're assuming that alloc_bulk() from irq_work
> is running within rcu_tasks_trace critical section,
> so __free_rcu_tasks_trace() callback will execute after
> irq work completed?
> I don't think that's the case.

Yes. The following are my original thoughts. Correct me if I am wrong:

1. llist_del_first() must be running concurrently with llist_del_all().
   If llist_del_first() runs after llist_del_all(), it will return NULL
   directly.
2. call_rcu_tasks_trace() must happen after llist_del_all(), else the
   elements in free_by_rcu_ttrace will not be freed back to slab.
3. call_rcu_tasks_trace() will wait for one tasks trace RCU grace period
   before calling __free_rcu_tasks_trace().
4. llist_del_first() is running in a context with irqs disabled, so the
   tasks trace RCU grace period will wait for the end of
   llist_del_first().

It seems you think step 4) is not true, right? (A minimal sketch of the
two paths and these assumptions is at the end of this mail.)

> In vCPU P1 is stopped for looong time by host,
> P2 can execute __free_rcu_tasks_trace (or P3, since
> tasks trace callbacks execute in a kthread that is not bound
> to any cpu).
> __free_rcu_tasks_trace() will free it into slab.
> Then kmalloc the same obj and eventually put it back into
> free_by_rcu_ttrace.
>
> Since you believe that waiting_for_gp_ttrace ABA is possible
> here it's the same probability. imo both lower than a bit flip due
> to cosmic rays which is actually observable in practice.
>
>> __free_rcu_tasks_trace
>>     free_all(llist_del_all(&c->waiting_for_gp_ttrace))
>>
>>
> .
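
To make the assumptions above concrete, here is a minimal sketch. It is
NOT the actual kernel/bpf/memalloc.c code: the *_sketch names and the
reduced struct are mine, only the llist/RCU calls mirror the paths
discussed above, and everything else is elided.

#include <linux/llist.h>
#include <linux/rcupdate.h>
#include <linux/container_of.h>

struct bpf_mem_cache_sketch {
	struct llist_head free_by_rcu_ttrace;
	struct llist_head waiting_for_gp_ttrace;
	struct rcu_head rcu_ttrace;
};

/* P1: called from irq work, so hard irqs are disabled while it runs */
static void alloc_bulk_sketch(struct bpf_mem_cache_sketch *c)
{
	/*
	 * Step 1): may race with the llist_del_all() in
	 * do_call_rcu_ttrace_sketch(). If the whole list was already
	 * moved away, this returns NULL and we are done.
	 */
	struct llist_node *obj = llist_del_first(&c->free_by_rcu_ttrace);

	if (!obj)
		return;
	/* ... reuse obj for a new allocation ... */
}

/* stand-in for free_all() in memalloc.c: return objects to slab */
static void free_all_sketch(struct llist_node *llnode)
{
	/* kfree() every object on the list; elided in this sketch */
}

/* Step 3): runs one tasks trace RCU grace period after step 2) */
static void free_rcu_tasks_trace_sketch(struct rcu_head *head)
{
	struct bpf_mem_cache_sketch *c =
		container_of(head, struct bpf_mem_cache_sketch, rcu_ttrace);

	free_all_sketch(llist_del_all(&c->waiting_for_gp_ttrace));
}

/* P2 */
static void do_call_rcu_ttrace_sketch(struct bpf_mem_cache_sketch *c)
{
	struct llist_node *llnode, *t;

	/* Step 2): detach the whole list, racing with llist_del_first() */
	llist_for_each_safe(llnode, t, llist_del_all(&c->free_by_rcu_ttrace))
		llist_add(llnode, &c->waiting_for_gp_ttrace);

	/*
	 * Step 4) is the assumption that the grace period started here
	 * cannot end before the irq-disabled llist_del_first() in P1
	 * has finished.
	 */
	call_rcu_tasks_trace(&c->rcu_ttrace, free_rcu_tasks_trace_sketch);
}

If step 4) does not hold, i.e. an irq-disabled region does not delay a
tasks trace RCU grace period, then nothing orders the callback after the
llist_del_first() in P1, which is the stopped-vCPU ABA scenario you
describe.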