Re: [PATCH bpf-next 1/3] bpf: Parameterize task iterators.

On Thu, 2022-07-28 at 19:00 +0200, Kumar Kartikeya Dwivedi wrote:
> On Thu, 28 Jul 2022 at 18:40, Kui-Feng Lee <kuifeng@xxxxxx> wrote:
> > 
> > On Thu, 2022-07-28 at 18:22 +0200, Kumar Kartikeya Dwivedi wrote:
> > > On Thu, 28 Jul 2022 at 17:16, Kui-Feng Lee <kuifeng@xxxxxx>
> > > wrote:
> > > > 
> > > > On Thu, 2022-07-28 at 10:47 +0200, Kumar Kartikeya Dwivedi
> > > > wrote:
> > > > > On Thu, 28 Jul 2022 at 07:25, Kui-Feng Lee <kuifeng@xxxxxx>
> > > > > wrote:
> > > > > > 
> > > > > > On Wed, 2022-07-27 at 10:19 +0200, Kumar Kartikeya Dwivedi
> > > > > > wrote:
> > > > > > > On Wed, 27 Jul 2022 at 09:01, Kui-Feng Lee
> > > > > > > <kuifeng@xxxxxx>
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > > On Tue, 2022-07-26 at 14:13 +0200, Jiri Olsa wrote:
> > > > > > > > > On Mon, Jul 25, 2022 at 10:17:11PM -0700, Kui-Feng
> > > > > > > > > Lee
> > > > > > > > > wrote:
> > > > > > > > > > Allow creating an iterator that loops through
> > > > > > > > > > resources
> > > > > > > > > > of
> > > > > > > > > > one
> > > > > > > > > > task/thread.
> > > > > > > > > > 
> > > > > > > > > > People could only create iterators to loop through
> > > > > > > > > > all
> > > > > > > > > > resources of
> > > > > > > > > > files, vma, and tasks in the system, even though
> > > > > > > > > > they
> > > > > > > > > > were
> > > > > > > > > > interested
> > > > > > > > > > in only the resources of a specific task or
> > > > > > > > > > process.
> > > > > > > > > > Passing
> > > > > > > > > > the
> > > > > > > > > > additional parameters, people can now create an
> > > > > > > > > > iterator to
> > > > > > > > > > go
> > > > > > > > > > through all resources or only the resources of a
> > > > > > > > > > task.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Kui-Feng Lee <kuifeng@xxxxxx>
> > > > > > > > > > ---
> > > > > > > > > >   include/linux/bpf.h            |  4 ++
> > > > > > > > > >   include/uapi/linux/bpf.h       | 23 ++++++++++
> > > > > > > > > >   kernel/bpf/task_iter.c         | 81
> > > > > > > > > > +++++++++++++++++++++++++-
> > > > > > > > > > ----
> > > > > > > > > > ----
> > > > > > > > > >   tools/include/uapi/linux/bpf.h | 23 ++++++++++
> > > > > > > > > >   4 files changed, 109 insertions(+), 22
> > > > > > > > > > deletions(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/include/linux/bpf.h
> > > > > > > > > > b/include/linux/bpf.h
> > > > > > > > > > index 11950029284f..c8d164404e20 100644
> > > > > > > > > > --- a/include/linux/bpf.h
> > > > > > > > > > +++ b/include/linux/bpf.h
> > > > > > > > > > @@ -1718,6 +1718,10 @@ int bpf_obj_get_user(const
> > > > > > > > > > char
> > > > > > > > > > __user
> > > > > > > > > > *pathname, int flags);
> > > > > > > > > > 
> > > > > > > > > >   struct bpf_iter_aux_info {
> > > > > > > > > >          struct bpf_map *map;
> > > > > > > > > > +       struct {
> > > > > > > > > > +               __u32   tid;
> > > > > > > > > 
> > > > > > > > > should be just u32 ?
> > > > > > > > 
> > > > > > > > Or, should change the following 'type' to __u8?
> > > > > > > 
> > > > > > > Would it be better to use a pidfd instead of a tid here?
> > > > > > > Unset
> > > > > > > pidfd
> > > > > > > would mean going over all tasks, and any fd > 0 implies
> > > > > > > attaching
> > > > > > > to
> > > > > > > a
> > > > > > > specific task (as is the convention in BPF land). Most of
> > > > > > > the
> > > > > > > new
> > > > > > > UAPIs working on processes are using pidfds (to work with
> > > > > > > a
> > > > > > > stable
> > > > > > > handle instead of a reusable ID).
> > > > > > > The iterator taking an fd also gives an opportunity to
> > > > > > > BPF
> > > > > > > LSMs
> > > > > > > to
> > > > > > > attach permissions/policies to it (once we have a file
> > > > > > > local
> > > > > > > storage
> > > > > > > map) e.g. whether creating a task iterator for that
> > > > > > > specific
> > > > > > > pidfd
> > > > > > > instance (backed by the struct file) would be allowed or
> > > > > > > not.
> > > > > > > You are using getpid in the selftest and keeping track of
> > > > > > > last_tgid
> > > > > > > in
> > > > > > > the iterator, so I guess you don't even need to extend
> > > > > > > pidfd_open
> > > > > > > to
> > > > > > > work on thread IDs right now for your use case (and
> > > > > > > fdtable
> > > > > > > and
> > > > > > > mm
> > > > > > > are
> > > > > > > shared for POSIX threads anyway, so for those two it
> > > > > > > won't
> > > > > > > make a
> > > > > > > difference).
> > > > > > > 
> > > > > > > What is your opinion?
> > > > > > 
> > > > > > Do you mean removed both tid and type, and replace them
> > > > > > with a
> > > > > > pidfd?
> > > > > > We can do that in uapi, struct bpf_link_info.  But, the
> > > > > > interal
> > > > > > types,
> > > > > > ex. bpf_iter_aux_info, still need to use tid or struct file
> > > > > > to
> > > > > > avoid
> > > > > > getting file from the per-process fdtable.  Is that what
> > > > > > you
> > > > > > mean?
> > > > > > 
> > > > > 
> > > > > Yes, just for the UAPI, it is similar to taking map_fd for
> > > > > map
> > > > > iter.
> > > > > In bpf_link_info we should report just the tid, just like map
> > > > > iter
> > > > > reports map_id.
> > > > 
> > > > It sounds good to me.
> > > > 
> > > > One thing I need a clarification. You mentioned that a fd > 0
> > > > implies attaching to a specific task, however fd can be 0. So, it
> > > > should be fd >= 0. So, it forces the user to initialize the value
> > > > of pidfd to -1.
> > > > So, for convenience, we still need a field like 'type' to make it
> > > > easy to create iterators without a filter.
> > > > 
> > > 
> > > Right, but in lots of BPF UAPI fields, fd 0 means fd is unset, so
> > > it
> > > is fine to rely on that assumption. For e.g. even for map_fd,
> > > bpf_map_elem iterator considers fd 0 to be unset. Then you don't
> > > need
> > > the type field.
> > 
> > I just realize that pidfd may be meaningless for the bpf_link_info
> > returned by bpf_obj_get_info_by_fd() since the origin fd might be
> > closed already.  So, I will always set it a value of 0.
> > 
> 
> For bpf_link_info, we should only be returning the tid of the task it
> is attached to, you cannot report the pidfd in bpf_link_info
> correctly (as you already realised). By default this would be 0,
> which is also an invalid tid, but when pidfd is set it will be the
> tid of the task it is attached to, so it works well.


We have had a lot of discussion about whether to use a tid or a pidfd.
Kumar also mentioned removing 'type'.
However, I have a feeling that we need to keep 'type' in struct
bpf_link_info.  I can imagine that we may want to create iterators of
tasks in a cgroup, or filtered by other parameters, in the future.
'type' will help us tell which kind of parameter an iterator was
created with.
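
To illustrate the idea, here is a rough sketch of how a 'type' field
could tell apart the kinds of parameters.  It is not taken from the
patch: every name in it (task_iter_param_type, task_iter_info,
task_iter_matches) is hypothetical, and the cgroup case is only a
placeholder for a possible future filter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; nothing here is from the patch. */
enum task_iter_param_type {
	TASK_ITER_PARAM_NONE = 0,	/* no filter: visit every task */
	TASK_ITER_PARAM_TID,		/* visit only the task with this tid */
	TASK_ITER_PARAM_CGROUP,		/* placeholder for a future cgroup filter */
};

struct task_iter_info {
	uint32_t type;			/* one of enum task_iter_param_type */
	union {
		uint32_t tid;		/* valid when type == TASK_ITER_PARAM_TID */
		uint64_t cgroup_id;	/* valid when type == TASK_ITER_PARAM_CGROUP */
	};
};

/* Return nonzero if a task with the given tid passes the filter. */
static int task_iter_matches(const struct task_iter_info *info, uint32_t tid)
{
	assert(info);

	switch (info->type) {
	case TASK_ITER_PARAM_NONE:
		return 1;
	case TASK_ITER_PARAM_TID:
		return info->tid == tid;
	default:
		/* Filters this sketch does not implement match nothing. */
		return 0;
	}
}
```

With this layout a zero-initialized struct means "no filter", so
existing users would not have to set anything, and a new kind of
parameter would only need a new enum value and a new union member.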





