On Mon, Nov 26, 2018 at 05:36:45PM +0200, Ariel Almog wrote:
> Hi,
>
> Please find below an RFC for exposing statistics counters for the RDMA
> subsystem.
>
> The proposed tool will be a supplementary command to the already existing
> rdma tool and will give the user a variety of options for RDMA debug and
> performance analysis.
>
> The tool will provide global counters out of the box and, in addition, the
> user will be able to manually tune it to monitor specific QP(s) or
> automatically monitor QP(s) according to predefined criteria.
>
> In detail
> *********
>
> In this RFC, the term "counter set" refers to a group of multiple counters
> that are allocated and read together. For example, a counter set might
> include a duplicate request counter, an implied NAK sequence error counter,
> etc.
>
> A counter set is part of a namespace and cannot be viewed/set from other
> namespaces.
>
> 1. Default, general counter
> ***************************
> This is an optional, default counter set, which is allocated by the RDMA
> driver upon init and is used to count all the traffic that is passing through
> the device. No configuration is needed and it cannot be disabled.
>
> Usage:
>     rdma dev stat
>         Shows the statistics of the general counter set

Ariel,

There is a need to see the status of those counter sets, e.g. allocated but
not bound yet, etc. Also, there is a need to see the utilization of such
counters, because they are HW resources.

So actually, statistics deserve their own subsection, e.g.
"rdma statistics ..." and not "rdma dev statistics ...".

>
> 2. Manual bind/unbind of counter set to QP(s)
> *********************************************
> This is an optional interface to allow the user to select QPs to be
> monitored. The process requires the user to
>     a. Allocate counter_set
>     b. Bind QP(s) to the allocated counter_set
>     c. Monitor the counter_set
>     d. Unbind QP(s) from the allocated counter_set
>     e. Deallocate the counter_set
>
> Usage:
>     rdma dev stat alloc
>         Allocates and returns a counter set id which can be bound to QP(s)
>
>     rdma dev stat dealloc <counter set id>
>         Deallocates a counter set id. All bound QP(s) shall be unbound
>         before deallocation
>
>     rdma dev stat bind <qp num> <counter set id>
>         Binds a QP to a counter set
>
>     rdma dev stat unbind <qp num>
>         Unbinds a QP from a counter set
>
>
> 3. Automatic bind/unbind of counter sets to QPs
> ***********************************************
> This is an optional interface to allow the user to automatically bind sets
> of QPs to counter sets according to common criteria. For example, in a per
> pid scheme, all QPs that belong to a pid are bound automatically to a single
> counter set.
>
> Usage:
>     rdma dev stat auto pid
>         Allocates a counter set per pid, binds all of the pid's QPs to this
>         counter set
>
>     rdma dev stat auto type
>         Allocates a counter set per QP type (RC, UD, Raw, etc.), binds all
>         QPs of this type to this counter set
>
>     rdma dev stat auto off
>         Deallocates all counter sets that were configured as auto

If you use the "rdma stat ..." form, you can model it in the same way as
"rdma res ...": based on objects, with properties specific to each of those
objects.

>
> 4. Exposing of counters set content
> ***********************************
> This is an optional interface to print the counter set content. It can print
> all counter sets or a specific counter set, according to the working mode.
>
> Usage:
>     rdma dev stat show [counter set id]|[pid]|[type]
>         Print statistics counters.
>         - In case manual mode was used, optional <counter set id> can be used
>         - In case auto per pid mode was used, optional <pid> can be used
>         - In case auto per type mode was used, optional <type> can be used
>
> I would like to get feedback on this proposal
>
> Thanks
> Ariel Almog
> Mellanox
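
To make the proposed semantics concrete, here is a minimal Python sketch of
the manual lifecycle (alloc, bind, unbind, dealloc), the per-criterion auto
binding, and the "allocated but not bound yet" status asked for above. It is
only a toy model of the rules as described in the RFC, not kernel or rdma tool
code. All names (CounterSetTable, auto_bind, the "qpn" key) are hypothetical,
and it assumes a QP is bound to at most one counter set, which the RFC implies
(unbind takes only a QP number) but does not state.

```python
class CounterSetTable:
    """Toy model of counter set bookkeeping (hypothetical, not a real API)."""

    def __init__(self):
        self._sets = {}        # counter set id -> set of bound QP numbers
        self._qp_to_set = {}   # QP number -> counter set id
        self._next_id = 1

    def alloc(self):
        """Allocate a counter set and return its id."""
        set_id = self._next_id
        self._next_id += 1
        self._sets[set_id] = set()
        return set_id

    def bind(self, qp_num, set_id):
        """Bind a QP to an allocated counter set."""
        if set_id not in self._sets:
            raise KeyError(f"counter set {set_id} is not allocated")
        if qp_num in self._qp_to_set:
            # Assumption: one counter set per QP at a time.
            raise ValueError(f"QP {qp_num} is already bound")
        self._sets[set_id].add(qp_num)
        self._qp_to_set[qp_num] = set_id

    def unbind(self, qp_num):
        """Unbind a QP from whichever counter set it is bound to."""
        if qp_num not in self._qp_to_set:
            raise KeyError(f"QP {qp_num} is not bound")
        set_id = self._qp_to_set.pop(qp_num)
        self._sets[set_id].discard(qp_num)

    def dealloc(self, set_id):
        """Deallocate a counter set; all QPs must be unbound first."""
        if set_id not in self._sets:
            raise KeyError(f"counter set {set_id} is not allocated")
        if self._sets[set_id]:
            raise RuntimeError("unbind all QPs before deallocation")
        del self._sets[set_id]

    def bound_qps(self, set_id):
        """Return the QP numbers currently bound to a counter set."""
        return frozenset(self._sets[set_id])

    def status(self):
        """Report each counter set as 'allocated' (no QPs yet) or 'bound'."""
        return {sid: ("bound" if qps else "allocated")
                for sid, qps in self._sets.items()}


def auto_bind(table, qps, criterion):
    """Allocate one counter set per distinct value of `criterion`
    (e.g. "pid" or "type") and bind every matching QP to it.

    `qps` is a list of dicts with a "qpn" key plus the criterion key.
    Returns a mapping: criterion value -> counter set id.
    """
    sets = {}
    for qp in qps:
        value = qp[criterion]
        if value not in sets:
            sets[value] = table.alloc()
        table.bind(qp["qpn"], sets[value])
    return sets


if __name__ == "__main__":
    table = CounterSetTable()
    sid = table.alloc()
    print(table.status())   # counter set exists, nothing bound yet
    table.bind(17, sid)
    print(table.status())
    table.unbind(17)
    table.dealloc(sid)
```

Keeping the status per counter set, rather than per device, is what makes
the object-based "rdma stat ..." form a natural fit: each counter set becomes
an object with its own properties, as with "rdma res ...".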