Re: [PATCH bpf-next 2/5] bpf: add main_thread_only customization for task/task_file iterators

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 26, 2020 at 5:07 PM Yonghong Song <yhs@xxxxxx> wrote:
>
> Currently, task and task_file by default iterates through
> all tasks. For task_file, by default, all files from all tasks
> will be traversed.
>
> But for a user process, the file_table is shared by all threads
> of that process. So traversing the main thread per process should
> be enough to traverse all files and this can save a lot of cpu
> time if some process has large number of threads and each thread
> has lots of open files.
>
> This patch implemented a customization for task/task_file iterator,
> permitting to traverse only the kernel task where its pid equal
> to tgid in the kernel. This includes some kernel threads, and
> main threads of user processes. This will solve the above potential
> performance issue for task_file. This customization may be useful
> for task iterator too if only traversing main threads is enough.
>
> Signed-off-by: Yonghong Song <yhs@xxxxxx>
> ---
>  include/linux/bpf.h            |  3 ++-
>  include/uapi/linux/bpf.h       |  5 ++++
>  kernel/bpf/task_iter.c         | 46 +++++++++++++++++++++++-----------
>  tools/include/uapi/linux/bpf.h |  5 ++++
>  4 files changed, 43 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a6131d95e31e..058eb9b0ba78 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1220,7 +1220,8 @@ int bpf_obj_get_user(const char __user *pathname, int flags);
>         int __init bpf_iter_ ## target(args) { return 0; }
>
>  struct bpf_iter_aux_info {
> -       struct bpf_map *map;
> +       struct bpf_map *map;    /* for iterator traversing map elements */
> +       bool main_thread_only;  /* for task/task_file iterator */

As a user of task_file iterator I'd hate to make this decision,
honestly, especially if I can't prove that all processes share the
same file table (I think clone() syscall allows to do weird
combinations like that, right?). It does make sense for a task/
iterator, though, if I need to iterate a user-space process (group of
tasks). So can we do this:

1a. Either add a new bpf_iter type process/ (or in kernel lingo
task_group/) to iterate only main threads (group_leader in kernel
lingo);
1b. Or make this main_thread_only an option for only a task/ iterator
(and maybe call it "group_leader_only" or something to reflect kernel
terminology?)

2. For task/file iterator, still iterate all tasks, but if the task's
files struct is the same as group_leader's files struct, then go to
the next one. This should eliminate tons of duplication of iterating
the same files over and over. It would still iterate a bunch of tasks,
but compared to the number of files that's generally a much smaller
number, so should still give sizable savings. I don't think we need an
extra option for this, tbh, this behavior was my intuitive
expectation, so I don't think you'll be breaking any sane user of this
iterator.

Disclaimer: I haven't got a chance to look through kernel code much,
so I'm sorry if what I'm proposing is something that is impossible to
implement or extremely hard/unreliable. But thought I'll put this idea
out there before we decide on this.

WDYT?

[...]



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux