On 04/10/2019 07:44 PM, Sebastian Andrzej Siewior wrote: > On 2019-04-10 19:19:27 [+0200], Daniel Borkmann wrote: >> On 04/10/2019 07:08 PM, Yonghong Song wrote: >>> On 4/10/19 7:30 AM, Sebastian Andrzej Siewior wrote: >>>> There is no difference between spinlock_t and raw_spinlock_t for !RT >>>> kernels. It is possible that bpf_lru_list_pop_free_to_local() invokes >>>> at least three list walks which look unbounded it probably makes sense >>>> to use spinlock_t. >>>> >>>> Make bpf_lru_list use a spinlock_t. >>> >>> Could you add a cover letter for the patch set since you have >>> more than one patch? > > yes. > >>>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> >>>> --- >>>> kernel/bpf/bpf_lru_list.c | 38 +++++++++++++++++++------------------- >>>> kernel/bpf/bpf_lru_list.h | 4 ++-- >>>> 2 files changed, 21 insertions(+), 21 deletions(-) >>>> >>>> diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c >>>> index e6ef4401a1380..40f47210c3817 100644 >>>> --- a/kernel/bpf/bpf_lru_list.c >>>> +++ b/kernel/bpf/bpf_lru_list.c >>>> @@ -313,9 +313,9 @@ static void bpf_lru_list_push_free(struct bpf_lru_list *l, >>>> if (WARN_ON_ONCE(IS_LOCAL_LIST_TYPE(node->type))) >>>> return; >>>> >>>> - raw_spin_lock_irqsave(&l->lock, flags); >>>> + spin_lock_irqsave(&l->lock, flags); >>>> __bpf_lru_node_move(l, node, BPF_LRU_LIST_T_FREE); >>>> - raw_spin_unlock_irqrestore(&l->lock, flags); >>>> + spin_unlock_irqrestore(&l->lock, flags); >>>> } >>> >>> This function (plus many others) is called by bpf program helpers which >>> cannot sleep. Is it possible that under RT spin_lock_irqsave() could >>> sleep and this will make bpf subsystem cannot be used under RT? >> >> Back then in ac00881f9221 ("bpf: convert hashtab lock to raw lock") >> Yang Shi converted spinlock_t to raw_spinlock_t due to exactly the >> above mentioned issue. I presume that hasn't changed, right? > > Ah. I checked one or two of those and it looked like it was raw since > the beginning. Anyway, it would be nice to Cc: the RT developer while The later ones like form LRU may have just been adapted to use the same after the conversion from ac00881f9221, but I presume there might unfortunately be little to no testing on RT kernels currently. Hopefully we'll get there such that at min bots would yell if something is off so it can be fixed. > fiddling with something that only concerns RT. I think that was the case back then, see discussion / Cc list: https://lore.kernel.org/netdev/1446243386-26582-1-git-send-email-yang.shi@xxxxxxxxxx/T/ > That usage pattern that is mentioned in ac00881f9221, is it true for all > data structure algorithms? In bpf_lru_list I was concerned about the > list loops. However hashtab and lpm_trie may perform memory allocations > while holding the lock and this isn't going to work. Do you have some document or guide for such typical patterns and how to make them behave better for RT? Thanks, Daniel