Hi Ariel,

> -----Original Message-----
> From: Ariel Almog
> Sent: Tuesday, November 27, 2018 4:59 AM
> To: Parav Pandit <parav@xxxxxxxxxxxx>; Ariel Almog
> <arielalmogworkemails@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Leon
> Romanovsky <leon@xxxxxxxxxx>; Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> Subject: RE: [RFC] RDMA : performance statistics
>
> > -----Original Message-----
> > From: Parav Pandit
> > Sent: Tuesday, November 27, 2018 5:40 AM
> > To: Ariel Almog <arielalmogworkemails@xxxxxxxxx>;
> > linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>; Jason
> > Gunthorpe <jgg@xxxxxxxxxxxx>; Ariel Almog <ariela@xxxxxxxxxxxx>
> > Subject: RE: [RFC] RDMA : performance statistics
> >
> > > -----Original Message-----
> > > From: linux-rdma-owner@xxxxxxxxxxxxxxx
> > > <linux-rdma-owner@xxxxxxxxxxxxxxx> On Behalf Of Ariel Almog
> > > Sent: Monday, November 26, 2018 9:37 AM
> > > To: linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>;
> > > Jason Gunthorpe <jgg@xxxxxxxxxxxx>; Ariel Almog <ariela@xxxxxxxxxxxx>
> > > Subject: [RFC] RDMA : performance statistics
> > >
> > > Hi,
> > >
> > > Please find below an RFC for exposing statistics counters for the
> > > RDMA subsystem.
> > >
> > > The proposed tool will be a supplementary command to the existing
> > > rdma tool and will give the user a variety of options for RDMA
> > > debug and performance analysis.
> > >
> > > The tool will provide global counters out of the box. In addition,
> > > the user will be able to manually tune it to monitor specific
> > > QP(s), or to automatically monitor QP(s) according to predefined
> > > criteria.
> > >
> > > In detail
> > > *********
> > >
> > > In this RFC, the term "counter set" refers to a group of multiple
> > > counters. For example, a counter set might include a duplicate
> > > request counter, an implied nak seq err counter, etc.
> > >
> > > A counter set is part of a namespace and cannot be viewed or set
> > > from other namespaces.
> > >
> > > 1. Default, general counter
> > > ***************************
> > > This is an optional, default counter set, which is allocated by the
> > > RDMA driver upon init and is used to count all the traffic that
> > > passes through the device. No configuration is needed and it cannot
> > > be disabled.
> > >
> > > Usage:
> > > rdma dev stat
> > >     Shows the statistics of the general counter set
> > >
> > All counters are vendor defined, like ethtool and like the current
> > /sys/class/infiniband/<device>/ports/<num>/hw_stats?
>
> Since the type of counters and HW support vary between devices, I think
> that this approach will be easier to adopt.
>
Yes, thanks.

> > > 2. Manual bind/unbind of counter set to QP(s)
> > > *********************************************
> > > This is an optional interface that allows the user to select QPs to
> > > be monitored. The process requires the user to:
> > > a. Allocate counter_set
> > > b. Bind QP(s) to the allocated counter_set
> > > c. Monitor the counter_set
> > > d. Unbind QP(s) from the allocated counter_set
> > > e. Deallocate the counter_set
> > >
> > > Usage:
> > > rdma dev stat alloc
> > >     Allocates and returns a counter set id which can be bound to
> > >     QP(s)
> > >
> > Can you please add port number (optional) to it as most vendors
> > support per port statistics on a multiport rdma device?
>
> Sure, will be added
>
> > > rdma dev stat dealloc <counter set id>
> > >     Deallocates a counter set id. All bound QP(s) must be unbound
> > >     before deallocation
> > >
> > > rdma dev stat bind <qp num> <counter set id>
> > >     Binds a QP to a counter set
> > >
> > > rdma dev stat unbind <qp num>
> > >     Unbinds a QP from a counter set
> > >
> > >
> > > 3. Automatic bind/unbind of counter sets to QPs
> > > ***********************************************
> > > This is an optional interface that allows the user to automatically
> > > group QPs into counter sets according to common criteria. For
> > > example, in a per pid scheme, all QPs that belong to a pid are
> > > bound automatically to a single counter set.
> > >
> > > Usage:
> > > rdma dev stat auto pid
> > >     Allocates a counter set per pid and binds all of the pid's QPs
> > >     to this counter set
> > >
> > Once fork support is added, and a QP is created by the parent
> > process and used by a child process to do post send/recv, does
> > counting happen against the parent process?
> >
> > > rdma dev stat auto type
> > >     Allocates a counter set per QP type (RC, UD, Raw, etc.) and
> > >     binds all QPs of this type to this counter set
> > >
> > Can this type be multiple, as type1, type2 etc., in the future?
> >
> > > rdma dev stat auto off
> > >     Deallocates all counter sets that were configured as auto
> > >
> > When there are 1000 QPs active on a counter set of auto type and
> > this command is issued, what happens?
> > Is the counter set deallocated, with the QPs bound to the default
> > counter set mentioned in (1)?
> > Or
> > does the command fail?
> >
> > > 4. Exposing of counters set content
> > > ***********************************
> > > This is an optional interface to print the counter set content. It
> > > can print all counter sets or a specific counter set, according to
> > > the working mode.
> > >
> > > Usage:
> > > rdma dev stat show [counter set id]|[pid]|[type]
> > >     Print statistics counters.
> > >     - In case manual mode was used, optional <counter set id> can
> > >       be used
> > >     - In case auto per pid mode was used, optional <pid> can be used
> > >     - In case auto per type mode was used, optional <type> can be
> > >       used
> > >
> > > I would like to get feedback on this proposal
> > >
> > > Thanks
> > > Ariel Almog
> > > Mellanox

Also, we are now moving away from query_device towards netlink, and
query_device is what has provided maximum values such as max_mr and
max_qp. So we need a command:

rdma stats dev [DEV] show max_counter_sets

This allows applications and users to decide how to use counter sets,
depending on how many counter sets a device supports.
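
To illustrate why a visible limit matters, here is a minimal Python sketch of the manual alloc/bind/unbind/dealloc flow from section 2. It is only a toy model under assumptions: the Device class, CounterSetError, and the method names are hypothetical and are not part of the rdma tool or any kernel API. It assumes that allocation fails once max_counter_sets is reached, and that dealloc is rejected while QPs are still bound, as the RFC text requires.

```python
class CounterSetError(Exception):
    """Raised when a modelled counter set operation cannot be performed."""


class Device:
    """Toy model of a device with a fixed pool of counter sets (hypothetical)."""

    def __init__(self, max_counter_sets):
        self.max_counter_sets = max_counter_sets
        self._sets = {}       # counter set id -> set of bound QP numbers
        self._qp_to_set = {}  # QP number -> counter set id
        self._next_id = 1

    def alloc(self):
        # Assumption: the device refuses allocation once the pool is exhausted.
        if len(self._sets) >= self.max_counter_sets:
            raise CounterSetError("no free counter sets")
        set_id = self._next_id
        self._next_id += 1
        self._sets[set_id] = set()
        return set_id

    def bind(self, qp_num, set_id):
        if set_id not in self._sets:
            raise CounterSetError("unknown counter set id")
        if qp_num in self._qp_to_set:
            raise CounterSetError("QP is already bound to a counter set")
        self._sets[set_id].add(qp_num)
        self._qp_to_set[qp_num] = set_id

    def unbind(self, qp_num):
        if qp_num not in self._qp_to_set:
            raise CounterSetError("QP is not bound to a counter set")
        set_id = self._qp_to_set.pop(qp_num)
        self._sets[set_id].discard(qp_num)

    def dealloc(self, set_id):
        if set_id not in self._sets:
            raise CounterSetError("unknown counter set id")
        # Per the RFC text, all bound QPs must be unbound before deallocation.
        if self._sets[set_id]:
            raise CounterSetError("counter set still has bound QPs")
        del self._sets[set_id]


if __name__ == "__main__":
    dev = Device(max_counter_sets=2)
    cs = dev.alloc()   # step a: allocate
    dev.bind(17, cs)   # step b: bind QP 17
    dev.unbind(17)     # step d: unbind
    dev.dealloc(cs)    # step e: deallocate
```

With the limit known up front, an application can plan how many QPs to group per counter set instead of discovering the limit through a failed alloc.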