On Tue, Sep 15, 2020 at 11:56 AM Yonghong Song <yhs@xxxxxx> wrote: > > > > On 9/15/20 10:40 AM, Jakub Kicinski wrote: > > On Tue, 15 Sep 2020 10:35:50 -0700 Yonghong Song wrote: > >> On 9/15/20 8:33 AM, Jakub Kicinski wrote: > >>> On Mon, 14 Sep 2020 11:46:30 -0700 Yonghong Song wrote: > >>>> Currently, we use bucket_lock when traversing bpf_sk_storage_map > >>>> elements. Since bpf_iter programs cannot use bpf_sk_storage_get() > >>>> and bpf_sk_storage_delete() helpers which may also grab bucket lock, > >>>> we do not have a deadlock issue which exists for hashmap when > >>>> using bucket_lock ([1]). > >>>> > >>>> If a bucket contains a lot of sockets, during bpf_iter traversing > >>>> a bucket, concurrent bpf_sk_storage_{get,delete}() may experience > >>>> some undesirable delays. Using rcu_read_lock() is a reasonable > >>>> compromise here. Although it may lose some precision, e.g., > >>>> access stale sockets, but it will not hurt performance of other > >>>> bpf programs. > >>>> > >>>> [1] https://lore.kernel.org/bpf/20200902235341.2001534-1-yhs@xxxxxx > >>>> > >>>> Cc: Martin KaFai Lau <kafai@xxxxxx> > >>>> Signed-off-by: Yonghong Song <yhs@xxxxxx> > >>> > >>> Sparse is not happy about it. Could you add some annotations, perhaps? > >>> > >>> include/linux/rcupdate.h:686:9: warning: context imbalance in 'bpf_sk_storage_map_seq_find_next' - unexpected unlock > >>> include/linux/rcupdate.h:686:9: warning: context imbalance in 'bpf_sk_storage_map_seq_stop' - unexpected unlock > >> > >> Okay, I will try. > >> > >> On my system, sparse is unhappy and core dumped.... > >> > >> /data/users/yhs/work/net-next/include/linux/string.h:12:38: error: too > >> many errors > >> /bin/sh: line 1: 2710132 Segmentation fault (core dumped) sparse > >> -D__linux__ -Dlinux -D__STDC__ -Dunix > >> -D__unix__ -Wbitwise -Wno-return-void -Wno-unknown-attribute > >> -D__x86_64__ --arch=x86 -mlittle-endian -m64 -W > >> p,-MMD,net/core/.bpf_sk_storage.o.d -nostdinc -isystem > >> ... > >> /data/users/yhs/work/net-next/net/core/bpf_sk_storage.c > >> make[3]: *** [net/core/bpf_sk_storage.o] Error 139 > >> make[3]: *** Deleting file `net/core/bpf_sk_storage.o' > >> > >> -bash-4.4$ rpm -qf /bin/sparse > >> sparse-0.5.2-1.el7.x86_64 > >> -bash-4.4$ > > > > I think you need to build from source, sadly :( > > > > https://git.kernel.org/pub/scm//devel/sparse/sparse.git > > Indeed, building sparse from source works. After adding some > __releases(RCU) and __acquires(RCU), I now have: > context imbalance in 'bpf_sk_storage_map_seq_find_next' - different > lock contexts for basic block > I may need to restructure code to please sparse... I don't think sparse can handle such things even with all annotations. I would spend too much time on it.