> -----Original Message-----
> From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-owner@xxxxxxxxxxxxxxx>
> On Behalf Of Ariel Almog
> Sent: Monday, November 26, 2018 9:37 AM
> To: linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>; Jason
> Gunthorpe <jgg@xxxxxxxxxxxx>; Ariel Almog <ariela@xxxxxxxxxxxx>
> Subject: [RFC] RDMA : performance statistics
>
> Hi,
>
> Please find below an RFC for exposing statistics counters for the RDMA
> subsystem.
>
> The proposed tool will be a supplementary command to the existing
> rdma tool and will give the user a variety of options for RDMA debug and
> performance analysis.
>
> The tool will provide global counters out of the box. In addition, the user
> will be able to manually tune it to monitor specific QP(s), or to
> automatically monitor QP(s) according to predefined criteria.
>
> In detail
> *********
>
> In this RFC, the term "counter set" refers to a group of multiple counters.
> For example, a counter set might include a duplicate request counter, an
> implied nak seq err counter, etc.
>
> A counter set is part of a namespace and cannot be viewed/set from other
> namespaces.
>
> 1. Default, general counter
> ***************************
> This is an optional, default counter set, which is allocated by the RDMA
> driver upon init and is used to count all the traffic that passes through
> the device. No configuration is needed and it cannot be disabled.
>
> Usage:
>     rdma dev stat
>         Shows the statistics of the general counter set
>

Are all counters vendor defined, like ethtool and like the current
/sys/class/infiniband/<device>/ports/<num>/hw_stats?

> 2. Manual bind/unbind of counter set to QP(s)
> *********************************************
> This is an optional interface that allows the user to select the QPs to be
> monitored.
> The process requires the user to
>     a. Allocate counter_set
>     b. Bind QP(s) to the allocated counter_set
>     c. Monitor the counter_set
>     d. Unbind QP(s) from the allocated counter_set
>     e. Deallocate the counter_set
>
> Usage:
>     rdma dev stat alloc
>         Allocates and returns a counter set id which can be bound to QP(s)
>

Can you please add an optional port number to it? Most vendors support
per-port statistics on a multiport rdma device.

>     rdma dev stat dealloc <counter set id>
>         Deallocates a counter set id. All bound QP(s) shall be unbound before
>         deallocation
>
>     rdma dev stat bind <qp num> <counter set id>
>         Binds QP to a counter set
>
>     rdma dev stat unbind <qp num>
>         Unbinds a QP from a counter set
>
>
> 3. Automatic bind/unbind of counter sets to QPs
> ***********************************************
> This is an optional interface that allows the user to automatically bind
> sets of QPs to counter sets according to common criteria. For example, in a
> per pid scheme, all QPs that belong to a pid are bound automatically to a
> single counter set.
>
> Usage:
>     rdma dev stat auto pid
>         Allocates a counter set per pid, binds all of the pid's QPs to this
>         counter set
>

Once fork support is added, and a QP is created by the parent process but
used by a child process to do post send/recv, does counting happen against
the parent process?

>     rdma dev stat auto type
>         Allocates a counter set per QP type (RC, UD, Raw, etc.), binds all QPs
>         of this type to this counter set
>

Can this accept multiple types (type1, type2, etc.) in the future?

>     rdma dev stat auto off
>         Deallocates all counter sets that were configured as auto
>

What happens when this command is issued while 1000 QPs are active on a
counter set of auto type? Is the counter set deallocated and the QPs bound
to the default counter set mentioned in (1)? Or does the command fail?

> 4. Exposing counter set content
> *******************************
> This is an optional interface to print the counter set content. It can print
> all counter sets or a specific counter set, according to the working mode.
>
> Usage:
>     rdma dev stat show [counter set id]|[pid]|[type]
>         Print statistics counters.
>         - In case manual mode was used, optional <counter set id> can be used
>         - In case auto per pid mode was used, optional <pid> can be used
>         - In case auto per type mode was used, optional <type> can be used
>
> I would like to get feedback on this proposal
>
> Thanks
> Ariel Almog
> Mellanox
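
For reference, the manual flow in section 2 (alloc, bind, unbind, dealloc)
could be modelled roughly as below. This is only a sketch in Python, not the
rdma tool or any driver code: the names CounterSetTable and CounterSetError
are hypothetical, and two behaviours are assumptions read into the RFC text.
First, a QP is bound to at most one counter set (suggested by unbind taking
only a QP number). Second, dealloc is rejected while QPs are still bound (the
RFC only says they "shall be" unbound first).

```python
class CounterSetError(Exception):
    """Raised when a request breaks one of the (assumed) rules."""


class CounterSetTable:
    """Toy in-memory model of manual counter-set handling."""

    def __init__(self):
        self._next_id = 1
        self._sets = {}       # counter set id -> set of bound QP numbers
        self._qp_to_set = {}  # QP number -> counter set id

    def alloc(self):
        """Allocate a counter set and return its id."""
        cs_id = self._next_id
        self._next_id += 1
        self._sets[cs_id] = set()
        return cs_id

    def bind(self, qp_num, cs_id):
        """Bind a QP to an allocated counter set."""
        if cs_id not in self._sets:
            raise CounterSetError(f"counter set {cs_id} is not allocated")
        if qp_num in self._qp_to_set:
            # Assumption: one counter set per QP.
            raise CounterSetError(f"QP {qp_num} is already bound")
        self._sets[cs_id].add(qp_num)
        self._qp_to_set[qp_num] = cs_id

    def unbind(self, qp_num):
        """Unbind a QP from whichever counter set it is bound to."""
        if qp_num not in self._qp_to_set:
            raise CounterSetError(f"QP {qp_num} is not bound")
        cs_id = self._qp_to_set.pop(qp_num)
        self._sets[cs_id].discard(qp_num)

    def dealloc(self, cs_id):
        """Deallocate a counter set; refused while QPs are still bound."""
        if cs_id not in self._sets:
            raise CounterSetError(f"counter set {cs_id} is not allocated")
        if self._sets[cs_id]:
            # Assumption: fail rather than silently unbind the QPs.
            raise CounterSetError(f"counter set {cs_id} still has bound QPs")
        del self._sets[cs_id]


table = CounterSetTable()
cs = table.alloc()       # rdma dev stat alloc
table.bind(17, cs)       # rdma dev stat bind 17 <cs>
try:
    table.dealloc(cs)    # refused: QP 17 is still bound
except CounterSetError as err:
    print("dealloc refused:", err)
table.unbind(17)         # rdma dev stat unbind 17
table.dealloc(cs)        # now succeeds
```

The same question raised above for "auto off" applies here: whether dealloc
with bound QPs fails, as in this sketch, or implicitly unbinds them is
something the RFC would need to spell out.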