On Thu, Jul 8, 2021 at 11:04 PM Martin KaFai Lau <kafai@xxxxxx> wrote:
>
> On Thu, Jul 08, 2021 at 08:52:23PM -0700, Alexei Starovoitov wrote:
> > On Thu, Jul 08, 2021 at 06:51:19PM -0700, Martin KaFai Lau wrote:
> > > > +
> > > > /* Called when map->refcnt goes to zero, either from workqueue or from syscall */
> > > > static void array_map_free(struct bpf_map *map)
> > > > {
> > > > @@ -382,6 +402,7 @@ static void array_map_free(struct bpf_map *map)
> > > > 	if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
> > > > 		bpf_array_free_percpu(array);
> > > >
> > > > +	array_map_free_timers(map);
> > > array_map_free() is called when map->refcnt has reached 0.
> > > By then, map->usercnt should already have reached 0
> > > and array_map_free_timers() should already have been called,
> > > so there is no need to call it here again?  The same goes for hashtab.
> >
> > Not sure it's that simple.
> > Currently the map->usercnt > 0 check is done for bpf_timer_set_callback only,
> > because prog refcnting is what matters to usercnt and the map_release_uref
> > scheme. bpf_timer_init doesn't have this check because there is no circular
> > dependency prog->map->timer->prog to worry about.
> > So after usercnt reaches zero the prog can still do bpf_timer_init.
> Ah, right. I missed the bpf_timer_init().
>
> > I guess we can add a usercnt > 0 check to bpf_timer_init as well.
> > Need to think whether it's enough and whether the race between
> > atomic64_read(usercnt) and atomic64_dec_and_test(usercnt) is addressed
> > the same way as the race between set_callback and cancel_and_free.
> > So far it looks like it. Hmm.
> hmm... right, checking usercnt > 0 seems ok.
> When usercnt is 0, it may be better to also error out instead of allocating
> a timer that cannot be used.
>
> I was mostly thinking that avoiding changes in map_free could make future
> map support a little easier.

ok. let me try with usercnt > 0 in bpf_timer_init.

> > > > +static void htab_free_malloced_timers(struct bpf_htab *htab)
> > > > +{
> > > > +	int i;
> > > > +
> > > > +	rcu_read_lock();
> > > > +	for (i = 0; i < htab->n_buckets; i++) {
> > > > +		struct hlist_nulls_head *head = select_bucket(htab, i);
> > > > +		struct hlist_nulls_node *n;
> > > > +		struct htab_elem *l;
> > > > +
> > > > +		hlist_nulls_for_each_entry(l, n, head, hash_node)
> > > Maybe put rcu_read_lock/unlock() inside the loop and do a
> > > cond_resched() in case the hashtab is large.
> Just recalled cond_resched_rcu() may be cleaner, like:
>
> static void htab_free_malloced_timers(struct bpf_htab *htab)
> {
> 	int i;
>
> 	rcu_read_lock();
> 	for (i = 0; i < htab->n_buckets; i++) {
> 		/* ... */
> 		hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
> 			check_and_free_timer(htab, l);
> 		cond_resched_rcu();

ahh. I didn't know about this flavor. Will give it a shot. Thanks!

> 	}
> 	rcu_read_unlock();
> }
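
For concreteness, a minimal sketch of what the usercnt > 0 check in
bpf_timer_init() discussed above could look like (an illustration only,
not the posted patch; the lock helpers, allocation call, and error codes
are assumptions based on the surrounding series):

	__bpf_spin_lock_irqsave(&timer->lock);
	if (timer->timer) {
		ret = -EBUSY;
		goto out;
	}
	/* Once usercnt has dropped to 0, map_release_uref() runs
	 * bpf_timer_cancel_and_free() on every element, so a timer
	 * allocated past this point would never be freed: bail out.
	 */
	if (!atomic64_read(&map->usercnt)) {
		ret = -EPERM;
		goto out;
	}
	t = bpf_map_kmalloc_node(map, sizeof(*t), GFP_ATOMIC, map->numa_node);
	if (!t) {
		ret = -ENOMEM;
		goto out;
	}
	/* ... hrtimer_init() and publishing t into timer->timer ... */
out:
	__bpf_spin_unlock_irqrestore(&timer->lock);

Reading usercnt under the same per-timer lock that
bpf_timer_cancel_and_free() takes is what resolves the race with
atomic64_dec_and_test(usercnt): either init observes usercnt == 0 and
errors out, or cancel_and_free runs afterwards and frees the timer.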
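
A side note on the suggested cond_resched_rcu(): the helper in
include/linux/sched.h is, modulo config gating, just the open-coded
unlock/resched/relock sequence, roughly:

static inline void cond_resched_rcu(void)
{
#if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
	rcu_read_unlock();
	cond_resched();
	rcu_read_lock();
#endif
}

Placing it after the inner hlist_nulls walk has finished, i.e. between
buckets, is what makes dropping the RCU read lock safe here: no list
iterator is held across the resched point, and the next iteration
re-reads the bucket head via select_bucket() under the re-acquired lock.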