On 4/10/19 10:44 AM, Sebastian Andrzej Siewior wrote: > On 2019-04-10 19:19:27 [+0200], Daniel Borkmann wrote: >> On 04/10/2019 07:08 PM, Yonghong Song wrote: >>> On 4/10/19 7:30 AM, Sebastian Andrzej Siewior wrote: >>>> There is no difference between spinlock_t and raw_spinlock_t for !RT >>>> kernels. It is possible that bpf_lru_list_pop_free_to_local() invokes >>>> at least three list walks which look unbounded it probably makes sense >>>> to use spinlock_t. >>>> >>>> Make bpf_lru_list use a spinlock_t. >>> >>> Could you add a cover letter for the patch set since you have >>> more than one patch? > > yes. > >>>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> >>>> --- >>>> kernel/bpf/bpf_lru_list.c | 38 +++++++++++++++++++------------------- >>>> kernel/bpf/bpf_lru_list.h | 4 ++-- >>>> 2 files changed, 21 insertions(+), 21 deletions(-) >>>> >>>> diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c >>>> index e6ef4401a1380..40f47210c3817 100644 >>>> --- a/kernel/bpf/bpf_lru_list.c >>>> +++ b/kernel/bpf/bpf_lru_list.c >>>> @@ -313,9 +313,9 @@ static void bpf_lru_list_push_free(struct bpf_lru_list *l, >>>> if (WARN_ON_ONCE(IS_LOCAL_LIST_TYPE(node->type))) >>>> return; >>>> >>>> - raw_spin_lock_irqsave(&l->lock, flags); >>>> + spin_lock_irqsave(&l->lock, flags); >>>> __bpf_lru_node_move(l, node, BPF_LRU_LIST_T_FREE); >>>> - raw_spin_unlock_irqrestore(&l->lock, flags); >>>> + spin_unlock_irqrestore(&l->lock, flags); >>>> } >>> >>> This function (plus many others) is called by bpf program helpers which >>> cannot sleep. Is it possible that under RT spin_lock_irqsave() could >>> sleep and this will make bpf subsystem cannot be used under RT? >> >> Back then in ac00881f9221 ("bpf: convert hashtab lock to raw lock") >> Yang Shi converted spinlock_t to raw_spinlock_t due to exactly the >> above mentioned issue. I presume that hasn't changed, right? > > Ah. I checked one or two of those and it looked like it was raw since > the beginning. Anyway, it would be nice to Cc: the RT developer while > fiddling with something that only concerns RT. > > That usage pattern that is mentioned in ac00881f9221, is it true for all > data structure algorithms? In bpf_lru_list I was concerned about the > list loops. For some skewed hash tables with a lot of elements, yes, it is possible time to traverse the loop could be a little bit longer... > However hashtab and lpm_trie may perform memory allocations > while holding the lock and this isn't going to work. The memory allocation in these two places uses GFP_ATOMIC which should not sleep/block. > > Sebastian >