On 2/28/23 6:40 PM, Aditi Ghag wrote:
+
+ if (hlist_empty(&hslot->head))
+ continue;
+
+ spin_lock_bh(&hslot->lock);
+ sk_for_each(sk, &hslot->head) {
+ if (seq_sk_match(seq, sk)) {
+ if (first) {
+ first_sk = sk;
+ first = false;
+ }
+ if (iter->end_sk < iter->max_sk) {
+ sock_hold(sk);
+ iter->batch[iter->end_sk++] = sk;
+ }
+ bucket_sks++;
+ }
+ }
+ spin_unlock_bh(&hslot->lock);
+ if (first_sk)
+ break;
+ }
+
+ /* All done: no batch made. */
+ if (!first_sk)
+ return NULL;
I think first_sk and bucket_sks also need to be reset in the "again" case?
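(For illustration only, a minimal sketch of that reset, assuming the again:
retry label and the batch-resize path live in the part of
bpf_iter_udp_batch() that is not quoted above; the variable names follow
the quoted hunk:)

again:
	/* Reset the per-scan state before rescanning, so a retry after
	 * growing the batch does not reuse stale values from the
	 * previous pass.
	 */
	first_sk = NULL;
	first = true;
	bucket_sks = 0;
	iter->end_sk = 0;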
If bpf_iter_udp_seq_stop() is called before a batch has been fully processed by the bpf prog in ".show", how does the next bpf_iter_udp_seq_start() continue from where it left off? The bpf_tcp_iter remembers the bucket and the offset-in-this-bucket. I think bpf_udp_iter can do something similar.
Hmm, I suppose you are referring to the `tcp_seek_last_pos` logic? Commit [1]
added that optimization back in v2.6, but only for TCP. Extending the same
logic to UDP seems out of scope for this PR? Although, reading the
description of [1], the latency issue seems specific to the file-based
iterators, no? Of course, BPF iterators are quite new, and I'm not sure we
have the same "slowness" reported in that commit. Having said that, do we
expect users to stop a BPF iterator and later resume from where they left
off? Regardless, I'll do some testing to ensure that we at least don't crash.
It is about ensuring the bpf-udp-iter makes forward progress, although
speeding things up is another plus. The iteration may have to be stopped
(i.e. bpf_iter_udp_seq_stop) for other reasons. The bpf-(udp|tcp|unix)-iter
is not only for doing bpf_setsockopt or bpf_sock_destroy. A major use case
is bpf_seq_printf, e.g. when seq_has_overflowed() in bpf_seq_read,
bpf_iter_udp_seq_stop will be called.
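As a concrete example of that use case, here is a minimal bpf-udp-iter
program that only dumps via BPF_SEQ_PRINTF, loosely modeled on the
bpf_iter_udp4 selftest (assuming vmlinux.h provides struct bpf_iter__udp
and libbpf's bpf_helpers.h provides BPF_SEQ_PRINTF):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

char _license[] SEC("license") = "GPL";

SEC("iter/udp")
int dump_udp(struct bpf_iter__udp *ctx)
{
	struct seq_file *seq = ctx->meta->seq;
	struct udp_sock *udp_sk = ctx->udp_sk;

	if (!udp_sk)
		return 0;

	/* One line of seq_file output per socket.  Once the seq_file
	 * buffer overflows in bpf_seq_read(), bpf_iter_udp_seq_stop()
	 * is called and the iteration later restarts from .start.
	 */
	BPF_SEQ_PRINTF(seq, "bucket %4d uid %u sport %04X\n",
		       ctx->bucket, ctx->uid,
		       bpf_ntohs(udp_sk->inet.inet_sport));
	return 0;
}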
If I read this patch correctly, bpf_iter_udp_seq_start() always starts from the
beginning of the current bucket. The bpf-iter cannot make forward progress if
bpf_iter_udp_seq_stop() is called in the middle of the bucket.
The "resume" is currently done by udp_get_idx(, *pos..) but is taken away
from bpf-udp-iter in this patch. udp_get_idx(...) starts counting from the
very first bucket and could potentially be reused here. However, I believe
it is easier for this patch to just resume from the current bucket.
Everything is there; all it needs is to remember its previous
position/offset within the bucket.
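Not what the patch does today, just a rough sketch of that idea; the field
names (e.g. offset) and the exact placement are hypothetical:

/* Hypothetical per-iterator state: remember how many matching
 * sockets of the current bucket were already handed to the bpf
 * prog before the last stop.
 */
struct bpf_udp_iter_state {
	struct udp_iter_state state;
	unsigned int cur_sk;
	unsigned int end_sk;
	unsigned int max_sk;
	int offset;		/* already-visited sockets in cur bucket */
	struct sock **batch;
};

/* In the batching loop, skip the sockets that were already shown
 * instead of always rebatching from the head of the bucket:
 */
	int resume_offset = iter->offset;

	spin_lock_bh(&hslot->lock);
	sk_for_each(sk, &hslot->head) {
		if (seq_sk_match(seq, sk)) {
			if (resume_offset > 0) {
				resume_offset--;
				continue;
			}
			if (iter->end_sk < iter->max_sk) {
				sock_hold(sk);
				iter->batch[iter->end_sk++] = sk;
			}
			bucket_sks++;
		}
	}
	spin_unlock_bh(&hslot->lock);

/* bpf_iter_udp_seq_next() would then bump iter->offset for each
 * socket it hands out and reset it to 0 when moving to the next
 * bucket, so a stop in the middle of a bucket still makes forward
 * progress on the next start.
 */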
[1] https://github.com/torvalds/linux/commit/a8b690f98baf9fb1902b8eeab801351ea603fa3a