Jakub Sitnicki wrote: > On Thu, Feb 06, 2020 at 08:43 PM CET, John Fastabend wrote: > > Jakub Sitnicki wrote: > >> Couple of fixes that came from recent discussion [0] on commit > >> 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down"). > >> > >> This series doesn't address the sleeping while holding a spinlock > >> problem. We're still trying to decide how to fix that [1]. > >> > >> Until then sockmap users might see the following warnings: > >> > >> | BUG: sleeping function called from invalid context at net/core/sock.c:2935 > [...] > Hey John, Patch sent. > > > Untested at the moment, but this should also be fine per your suggestion > > (if I read it correctly). The reason we have stab->lock and bucket->locks > > here is to handle checking EEXIST in update/delete cases. We need to > > be careful that when an update happens and we check for EEXIST that the > > socket is added/removed during this check. So both map_update_common and > > sock_map_delete need to guard from being run together potentially deleting > > an entry we are checking, etc. > > Okay, thanks for explanation. IOW, we're serializing map writers. > > > But by the time we get here we just did a synchronize_rcu() in the > > line above so no updates/deletes should be in flight. So it seems safe > > to drop these locks because of the condition no updates in flight. > > This part is not clear to me. I might be missing something. > > Here's my thinking - for any map writes (update/delete) to start, > map->refcnt needs to be > 0, and the ref is not dropped until the write > operation has finished. > > Map FDs hold a ref to map until the FD gets released. And BPF progs hold > refs to maps until the prog gets unloaded. > > This would mean that map_free will get scheduled from __bpf_map_put only > when no one is holding a map ref, and could start a write that would be > happening concurrently with sock_{map,hash}_free: Sorry bringing back this old thread I'm not sure I followed the couple paragraphs here. Is this with regards to the lock or the rcu? II didn't want to just drop this thanks. We can't have new updates/lookups/deletes happening while we are free'ing a map that would cause all sorts of problems, use after free's, etc. > > /* decrement map refcnt and schedule it for freeing via workqueue > * (unrelying map implementation ops->map_free() might sleep) > */ > static void __bpf_map_put(struct bpf_map *map, bool do_idr_lock) > { > if (atomic64_dec_and_test(&map->refcnt)) { > /* bpf_map_free_id() must be called first */ > bpf_map_free_id(map, do_idr_lock); > btf_put(map->btf); > INIT_WORK(&map->work, bpf_map_free_deferred); > schedule_work(&map->work); > } > } > > > So with patch below we keep the sync rcu but that is fine IMO these > > map free's are rare. Take a look and make sure it seems sane to you > > as well. > > I can't vouch for the need to keep synchronize_rcu here because I don't > understand that part, but otherwise the change LGTM. > > -jkbs > [...]