On Fri, Aug 19, 2022 at 04:31:11PM +0200, Kumar Kartikeya Dwivedi wrote:
> On Fri, 19 Aug 2022 at 00:30, Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > Right. We cannot fail in unit_free().
> > With per-cpu counter both unit_alloc() and free_bulk_nmi() would
> > potentially fail in such unlikely scenario.
> > Not a big deal for free_bulk_nmi(). It would pick the element later.
> > For unit_alloc() returning NULL is normal.
> > Especially since it's so unlikely for nmi to hit right in the middle
> > of llist_del_first().
> >
> > Since we'll add this per-cpu counter to solve interrupted llist_del_first()
> > it feels that the same counter can be used to protect unit_alloc/free/irq_work.
> > Then we wouldn't need free_llist_nmi. A single free_llist would be fine,
> > but unit_free() should not fail. If free_llist cannot be accessed
> > due to the per-cpu counter being busy we have to push somewhere.
> > So it seems two lists are necessary. Maybe it's still better?
> > Roughly I'm thinking of the following:
> >
> > unit_alloc()
> > {
> >         llnode = NULL;
> >         local_irq_save();
> >         if (__this_cpu_inc_return(c->alloc_active) != 1)
> >                 goto out;
> >         llnode = __llist_del_first(&c->free_llist);
> >         if (llnode)
> >                 cnt = --c->free_cnt;
> > out:
> >         __this_cpu_dec(c->alloc_active);
> >         local_irq_restore();
> >         return llnode;
> > }
> >
> > unit_free()
> > {
> >         local_irq_save();
> >         if (__this_cpu_inc_return(c->alloc_active) != 1) {
> >                 llist_add(llnode, &c->free_llist_nmi);
> >                 goto out;
> >         }
> >         __llist_add(llnode, &c->free_llist);
> >         cnt = ++c->free_cnt;
> > out:
> >         __this_cpu_dec(c->alloc_active);
> >         local_irq_restore();
> > }
> >
> > alloc_bulk and free_bulk would be protected by alloc_active as well.
> > alloc_bulk_nmi is gone.
> > free_bulk_nmi is still there to drain unlucky unit_free,
> > but it's now alone in doing llist_del_first() and it just frees anything
> > that is in the free_llist_nmi.
> > The main advantage is that free_llist_nmi doesn't need to be prefilled.
> > It will be empty most of the time.
> > wdyt?
>
> Looks great! The other option would be to not have the overflow
> free_llist_nmi list and just allow llist_add to free_llist from the
> NMI case even if we interrupt llist_del_first, but then the non-NMI
> user needs to use the atomic llist_add version as well (since we may
> interrupt it),

not only llist_add, but unit_alloc would have to use atomic
llist_del_first too. So any operation on the list would have to be
done with cmpxchg.

> which won't be great for performance.

exactly.

> So having the extra list is much better.

yep. same thinking.
I'll refactor the patches and send v3 with this approach.