On 5/8/20 12:07 PM, Andrii Nakryiko wrote:
On Wed, May 6, 2020 at 10:40 PM Yonghong Song <yhs@xxxxxx> wrote:
Macro DEFINE_BPF_ITER_FUNC is implemented so target
can define an init function to capture the BTF type
which represents the target.
The bpf_iter_meta is a structure holding meta data, common
to all targets in the bpf program.
Additional marker functions are called before or after
bpf_seq_read() show()/next()/stop() callback functions
to help calculate a precise seq_num and decide whether to call bpf_prog
inside stop().
Two functions, bpf_iter_get_info() and bpf_iter_run_prog(),
are implemented so target can get needed information from
bpf_iter infrastructure and can run the program.
Signed-off-by: Yonghong Song <yhs@xxxxxx>
---
include/linux/bpf.h | 11 ++++++
kernel/bpf/bpf_iter.c | 86 ++++++++++++++++++++++++++++++++++++++++---
2 files changed, 92 insertions(+), 5 deletions(-)
Looks good. I was worried about re-using seq_num when an element is
skipped, but the same seq_num can already end up associated with
different objects: overflow + retry can return a different object
(because iteration is not a snapshot, so the element could be gone on
retry). Both cases will have to be handled in about the same fashion,
so it's fine.
This is what I thought as well.
Hm... Could this be a problem for the start() implementation? E.g., if
the object is still there, but the iterator wants to skip it permanently.
Will re-using seq_num mean that start() keeps trying to fetch the same
to-be-skipped element? Not sure, please think about it, but we can fix
it up later, if necessary.
The seq_num is for the bpf program context. It does not affect how start()
behaves. start() MAY retry the same element over and over again
if show() overflows or returns <0, in which case user space
should check the returned error code to decide whether to retry or give up.
Acked-by: Andrii Nakryiko <andriin@xxxxxx>
[...]
@@ -112,11 +143,16 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
err = PTR_ERR(p);
break;
}
+
+ /* get a valid next object, increase seq_num */
typo: get -> got
Ack.
+ bpf_iter_inc_seq_num(seq);
+
if (seq->count >= size)
break;
err = seq->op->show(seq, p);
if (err > 0) {
+ bpf_iter_dec_seq_num(seq);
seq->count = offs;
} else if (err < 0 || seq_has_overflowed(seq)) {
seq->count = offs;
[...]
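
For readers following the seq_num discussion, here is a minimal user-space sketch of the bookkeeping in the quoted hunk: the counter is bumped once a valid next object is obtained and handed back when show() returns > 0 (the object is skipped). It is not kernel code. struct iter_state, iter_inc_seq_num(), iter_dec_seq_num() and simulate_walk() are hypothetical stand-ins, the show() results are supplied as an array, and overflow and error handling are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the iterator's private data; only seq_num is modeled. */
struct iter_state {
	uint64_t seq_num;
};

/* Stand-ins for bpf_iter_inc_seq_num() / bpf_iter_dec_seq_num() in the hunk. */
static void iter_inc_seq_num(struct iter_state *st) { st->seq_num++; }
static void iter_dec_seq_num(struct iter_state *st) { st->seq_num--; }

/*
 * Walk n objects. show_ret[i] models what show() returns for object i:
 *   > 0  the object is skipped, so its seq_num is given back
 *   == 0 the object is emitted
 * seen[i] records the counter value in effect while show() ran for object i.
 * Returns the counter value after the walk.
 */
static uint64_t simulate_walk(const int *show_ret, uint64_t *seen, size_t n)
{
	struct iter_state st = { .seq_num = 0 };

	assert(show_ret != NULL && seen != NULL);

	for (size_t i = 0; i < n; i++) {
		/* got a valid next object, increase seq_num */
		iter_inc_seq_num(&st);
		seen[i] = st.seq_num;

		/* show() asked to skip this object: reuse its seq_num */
		if (show_ret[i] > 0)
			iter_dec_seq_num(&st);
	}
	return st.seq_num;
}
```

With show() results {0, 1, 0}, the second (skipped) object and the third object both observe the same counter value in this model, which is the re-use discussed above.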