> On Sep 19, 2023, at 5:38 PM, Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > > On 5/19/23 3:51 PM, Aditi Ghag wrote: >> +static struct sock *bpf_iter_udp_batch(struct seq_file *seq) >> +{ >> + struct bpf_udp_iter_state *iter = seq->private; >> + struct udp_iter_state *state = &iter->state; >> + struct net *net = seq_file_net(seq); >> + struct udp_table *udptable; >> + unsigned int batch_sks = 0; >> + bool resized = false; >> + struct sock *sk; >> + >> + /* The current batch is done, so advance the bucket. */ >> + if (iter->st_bucket_done) { >> + state->bucket++; >> + iter->offset = 0; >> + } >> + >> + udptable = udp_get_table_seq(seq, net); >> + >> +again: >> + /* New batch for the next bucket. >> + * Iterate over the hash table to find a bucket with sockets matching >> + * the iterator attributes, and return the first matching socket from >> + * the bucket. The remaining matched sockets from the bucket are batched >> + * before releasing the bucket lock. This allows BPF programs that are >> + * called in seq_show to acquire the bucket lock if needed. >> + */ >> + iter->cur_sk = 0; >> + iter->end_sk = 0; >> + iter->st_bucket_done = false; >> + batch_sks = 0; >> + >> + for (; state->bucket <= udptable->mask; state->bucket++) { >> + struct udp_hslot *hslot2 = &udptable->hash2[state->bucket]; >> + >> + if (hlist_empty(&hslot2->head)) { >> + iter->offset = 0; >> + continue; >> + } >> + >> + spin_lock_bh(&hslot2->lock); >> + udp_portaddr_for_each_entry(sk, &hslot2->head) { >> + if (seq_sk_match(seq, sk)) { >> + /* Resume from the last iterated socket at the >> + * offset in the bucket before iterator was stopped. >> + */ >> + if (iter->offset) { >> + --iter->offset; > > Hi Aditi, I think this part has a bug. > > When I run './test_progs -t bpf_iter/udp6' in a machine with some udp so_reuseport sockets, this test is never finished. > > A broken case I am seeing is when the bucket has >1 sockets and bpf_seq_read() can only get one sk at a time before it calls bpf_iter_udp_seq_stop(). Just so that I understand the broken case better, are you doing something in your BPF iterator program so that "bpf_seq_read() can only get one sk at a time"? > > I did not try the change yet. However, from looking at the code where iter->offset is changed, --iter->offset here is the most likely culprit and it will make backward progress for the same bucket (state->bucket). Other places touching iter->offset look fine. > > It needs a local "int offset" variable for the zero test. Could you help to take a look, add (or modify) a test and fix it? > > The progs/bpf_iter_udp[46].c test can be used to reproduce. The test_udp[46] in prog_tests/bpf_iter.c needs to be changed though to ensure there is multiple sk in the same bucket. Probably a few so_reuseport sk should do. The sock_destroy patch set had added a test with multiple so_reuseport sks in a bucket in order to exercise batching [1]. I was wondering if extending the test with an additional bucket should do it, or some more cases are required (asked for clarification above) to reproduce the issue. [1] https://elixir.bootlin.com/linux/v6.5/source/tools/testing/selftests/bpf/prog_tests/sock_destroy.c#L146 > > Thanks. > >> + continue; >> + } >> + if (iter->end_ >