On Tue, Apr 02, 2019 at 10:22:24AM +0300, Gal Pressman wrote:
> On 01-Apr-19 15:19, Leon Romanovsky wrote:
> > On Mon, Apr 01, 2019 at 02:59:16PM +0300, Gal Pressman wrote:
> >> On 01-Apr-19 11:47, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> >>>
> >>> Hi,
> >>>
> >>> This series from Mark provides dynamic statistics infrastructure.
> >>> He uses netlink interface to configure and retrieve those counters.
> >>>
> >>> This infrastructure allows to users monitor various objects by binding
> >>> to them counters. As the beginning, we used QP object as target for
> >>> those counters, but future patches will include ODP MR information too.
> >>
> >> Hi Leon and Mark,
> >> Thanks for doing this!
> >>
> >>>
> >>> Two binding modes are supported:
> >>> - Auto: This allows a user to build automatic set of objects to a counter
> >>
> >> build = bind?
> >
> > "build" == "chain". In theory, user will be able to create very complex
> > filters, send those chains and kernel will handle it.
> > For example, bind counters for UD QP, on specific port and for new
> > processes all together.
> >
> >>
> >>> according to common criteria. For example in a per-type scheme, where in
> >>> one process all QPs with same QP type are bound automatically to a single
> >>> counter.
> >>
> >> How do we decide which criteria is suitable for auto mode and why is it better
> >> than letting the userspace handle it by itself (query all QPs and bind certain
> >> types to "manual" counters).
> >> Seems like doing it in userspace provides more flexibility than a fixed set of
> >> kernel auto types.
> >
> > "Auto mode" allows to get counters during object creation, for example
> > in ODP MR case, it will give us a chance to count pagefaults immediately
> > after MRs are created. It is good from system perspective too, he will
> > need to configure policy only once during boot and it will simply work.
>
> I understand the motivation but this mode requires special handling for each
> case of "auto mode type", and I'm not sure I understand what qualifies for
> having its own type.
>
> This example uses the QP type, can I push a patch to auto bind all QPs that have
> 2 max_recv_sge?

Like all other APIs changes, we will apply common sense.

>
> >
> >>
> >> Is there a reason to have one auto counter per port?
> >> Theoretically I can allocate two auto counters and assign a different auto mask
> >> to each one.
> >
> > Sometimes you need to say enough is enough :), we didn't want to add so
> > much complexity without solid use case justification.
> >
> > From implementation perspective, we will be able to do it later, because
> > it won't require any change in kernel API. Just need to ensure that such
> > masks are returned with dumpit.
> >
> >>
> >>> - Manual: This allows a user to manually bind objects on a counter.
> >>>
> >>> Those two modes are mutual-exclusive with separation between processes,
> >>> objects created by different processes cannot be bound to a same counter.
> >>>
> >>> For objects which don't support counter binding, we will return
> >>> pre-allocated counters.
> >>
> >> Can you explain? What are those objects and what are pre allocated counters?
> >
> > For example MR counters, we thought to add very simple set of them and
> > make always available.
>
> Can you please add an example output of what you have in mind? Which userspace
> command triggers these pre allocated-counters?
>
> >
> >>
> >>>
> >>> $ rdma statistic qp set link mlx5_2/1 auto type on
> >>> $ rdma statistic qp set link mlx5_2/1 auto off
> >>> $ rdma statistic qp bind link mlx5_2/1 lqpn 178
> >>> $ rdma statistic qp unbind link mlx5_2/1 cntn 4 lqpn 178
> >>> $ rdma statistic show
> >>> $ rdma statistic qp mode
> >>>
> >>> Thanks
> >>>
> >>> Mark Zhang (16):
> >>>   net/mlx5: Add rts2rts_qp_counters_set_id field in hca cap
> >>>   RDMA/restrack: Introduce statistic counter
> >>>   RDMA/restrack: Add an API to attach a task to a resource
> >>>   RDMA/restrack: Make is_visible_in_pid_ns() as an API
> >>>   RDMA/counter: Add set/clear per-port auto mode support
> >>>   RDMA/counter: Add "auto" configuration mode support
> >>>   IB/mlx5: Support set qp counter
> >>>   IB/mlx5: Add counter set id as a parameter for mlx5_ib_query_q_counters()
> >>>   IB/mlx5: Support statistic q counter configuration
> >>>   RDMA/nldev: Allow counter auto mode configuration through RDMA netlink
> >>>   RDMA/netlink: Implement counter dumpit callback
> >>>   IB/mlx5: Add counter_alloc_stats() and counter_update_stats() support
> >>>   RDMA/core: Get sum value of all counters when perform a sysfs stat read
> >>>   RDMA/counter: Allow manual mode configuration support
> >>>   RDMA/nldev: Allow counter manual mode configuration through RDMA netlink
> >>>   RDMA/nldev: Allow get counter mode through RDMA netlink
> >>>
> >>>  drivers/infiniband/core/Makefile     |    2 +-
> >>>  drivers/infiniband/core/counters.c   |  652 +++++++++++++++++++++++++++
> >>>  drivers/infiniband/core/device.c     |   14 +
> >>>  drivers/infiniband/core/nldev.c      |  427 +++++++++++++++++-
> >>>  drivers/infiniband/core/restrack.c   |   49 +-
> >>>  drivers/infiniband/core/restrack.h   |    3 +
> >>>  drivers/infiniband/core/sysfs.c      |   10 +-
> >>>  drivers/infiniband/core/verbs.c      |    9 +
> >>>  drivers/infiniband/hw/mlx5/main.c    |   88 +++-
> >>>  drivers/infiniband/hw/mlx5/mlx5_ib.h |    6 +
> >>>  drivers/infiniband/hw/mlx5/qp.c      |   76 +++-
> >>>  include/linux/mlx5/mlx5_ifc.h        |    4 +-
> >>>  include/linux/mlx5/qp.h              |    1 +
> >>>  include/rdma/ib_verbs.h              |   32 ++
> >>>  include/rdma/rdma_counter.h          |   64 +++
> >>>  include/rdma/restrack.h              |    4 +
> >>>  include/uapi/rdma/rdma_netlink.h     |   52 ++-
> >>>  17 files changed, 1462 insertions(+), 31 deletions(-)
> >>>  create mode 100644 drivers/infiniband/core/counters.c
> >>>  create mode 100644 include/rdma/rdma_counter.h
> >>>
> >>> --
> >>> 2.20.1
> >>>
> >>
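
To make the binding rules discussed above easier to follow, here is a toy
userspace model in Python. It is only a sketch of the semantics described in
the cover letter (per-type auto binding within one process, manual binding,
no sharing of a counter between processes); it is not the kernel code from
the series. The names PortCounters, create_qp and bind_manual are
hypothetical, and treating auto and manual as exclusive per port is an
assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class QP:
    lqpn: int      # local QP number
    qp_type: str   # e.g. "RC" or "UD"
    pid: int       # process that created the QP


@dataclass
class PortCounters:
    """Toy model of counter bindings on one port (hypothetical, not kernel code)."""
    auto_by_type: bool = False
    next_id: int = 1
    # (pid, qp_type) -> counter id, used only in auto mode
    auto_counters: Dict[Tuple[int, str], int] = field(default_factory=dict)
    # lqpn -> counter id
    bound: Dict[int, int] = field(default_factory=dict)
    # counter id -> pid that owns it
    owner: Dict[int, int] = field(default_factory=dict)

    def _alloc(self, pid: int) -> int:
        cid = self.next_id
        self.next_id += 1
        self.owner[cid] = pid
        return cid

    def create_qp(self, qp: QP) -> Optional[int]:
        """Auto mode: bind at creation to the counter for (pid, qp_type)."""
        if not self.auto_by_type:
            return None  # nothing is bound automatically
        key = (qp.pid, qp.qp_type)
        if key not in self.auto_counters:
            self.auto_counters[key] = self._alloc(qp.pid)
        self.bound[qp.lqpn] = self.auto_counters[key]
        return self.bound[qp.lqpn]

    def bind_manual(self, qp: QP, counter_id: Optional[int] = None) -> int:
        """Manual mode: bind to a given counter, or to a freshly allocated one."""
        if self.auto_by_type:
            # Assumption: the two modes are exclusive per port.
            raise RuntimeError("port is in auto mode")
        if counter_id is None:
            counter_id = self._alloc(qp.pid)
        elif counter_id not in self.owner:
            raise ValueError("unknown counter")
        elif self.owner[counter_id] != qp.pid:
            # QPs created by different processes may not share a counter.
            raise PermissionError("counter belongs to another process")
        self.bound[qp.lqpn] = counter_id
        return counter_id
```

In this model, two RC QPs created by the same process in auto mode end up on
one counter, while a UD QP from that process, or an RC QP from another
process, gets a counter of its own.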