> -----Original Message-----
> From: Gal Pressman <galpress@xxxxxxxxxx>
> Sent: Tuesday, November 27, 2018 2:47 PM
> To: Ariel Almog <ariela@xxxxxxxxxxxx>; Gal Pressman
> <galpress@xxxxxxxxxx>; Ariel Almog <arielalmogworkemails@xxxxxxxxx>;
> linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>; Jason
> Gunthorpe <jgg@xxxxxxxxxxxx>
> Subject: Re: [RFC] RDMA : performance statistics
>
> On 27-Nov-18 12:55, Ariel Almog wrote:
> >
> >
> >> -----Original Message-----
> >> From: Gal Pressman <galpress@xxxxxxxxxx>
> >> Sent: Tuesday, November 27, 2018 11:50 AM
> >> To: Ariel Almog <arielalmogworkemails@xxxxxxxxx>;
> >> linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>; Jason
> >> Gunthorpe <jgg@xxxxxxxxxxxx>; Ariel Almog <ariela@xxxxxxxxxxxx>
> >> Subject: Re: [RFC] RDMA : performance statistics
> >>
> >> On 26-Nov-18 17:36, Ariel Almog wrote:
> >>> Hi,
> >>>
> >>> Please find below an RFC for exposing statistics counters for the
> >>> RDMA subsystem.
> >>>
> >>> The proposed tool will be a supplementary command to the already
> >>> existing rdma tool and will give the user a variety of options for
> >>> RDMA debug and performance analysis.
> >>>
> >>> The tool will provide out-of-the-box global counters and, in
> >>> addition, the user will be able to manually tune it to monitor
> >>> specific QP(s) or automatically monitor QP(s) according to
> >>> predefined criteria.
> >>
> >> Hi Ariel,
> >> Any reason to limit monitoring to QPs only? What about other resources?
> >>
> >
> > Hi Gal
> >
> > Long time no talk :-)
>
> Indeed :)
>
> >
> > No reason for not having other monitoring abilities under this tool.
> > If the resource is general, the interface is ready for it.
> > If the resource belongs to some object, it shall be added (now or in
> > the future).
> >
> > Which other resources did you have in mind?
>
> I think that having CQs/AHs/MRs/.. as well would be nice, maybe allow
> counters for every resource that is tracked by the rdma resource tracker.
> Will need to make adjustments to the "auto" mode to fit each one.

Agreed

> Can we extend the interface to support vendor specific resources as well?

Vendor specific resources which don't fall under the QPs/CQs/AHs/MRs/global
categories?
I feel that allowing vendor specific counters under these categories is a must.
Having a vendor specific category requires justification.

> what happens with interesting counters that we can't bind to a specific
> resource?

If it is a global resource, I see no issue with presenting it.

> >
> >>>
> >>> In details
> >>> **********
> >>>
> >>> In this RFC, the term "counter set" refers to a group of multiple
> >>> counters. For example, a counter set might include a duplicate
> >>> request counter, an implied nak seq err counter, etc.
> >>>
> >>> A counter set is part of a namespace and cannot be viewed/set from
> >>> other namespaces.
> >>>
> >>> 1. Default, general counter
> >>> ***************************
> >>> This is an optional, default counter set, which is allocated by the
> >>> RDMA driver upon init and is used to count all the traffic that is
> >>> passing through the device. No configuration is needed and it
> >>> cannot be disabled.
> >>>
> >>> Usage:
> >>> rdma dev stat
> >>>       Shows the statistics of the general counter set
> >>>
> >>> 2. Manual bind/unbind of counter set to QP(s)
> >>> *********************************************
> >>> This is an optional interface to allow the user to select QPs to be
> >>> monitored.
> >>> The process requires the user to
> >>> a. Allocate counter_set
> >>> b. Bind QP(s) to the allocated counter_set
> >>> c. Monitor the counter_set
> >>> d. Unbind QP(s) from the allocated counter_set
> >>> e. Deallocate the counter_set
> >>>
> >>> Usage:
> >>> rdma dev stat alloc
> >>>       Allocates and returns a counter set id which can be bound to
> >>>       QP(s)
> >>>
> >>> rdma dev stat dealloc <counter set id>
> >>>       Deallocates a counter set id.
> >>>       All bound QP(s) shall be unbound before
> >>>       deallocation
> >>>
> >>> rdma dev stat bind <qp num> <counter set id>
> >>>       Binds QP to a counter set
> >>>
> >>> rdma dev stat unbind <qp num>
> >>>       Unbinds a QP from a counter set
> >>>
> >>>
> >>> 3. Automatic bind/unbind of counter sets to QPs
> >>> ***********************************************
> >>> This is an optional interface to allow the user to automatically
> >>> bind groups of QPs to counter sets according to common criteria.
> >>> For example, a per-pid scheme, where all QPs belonging to a pid are
> >>> bound automatically to a single counter set.
> >>>
> >>> Usage:
> >>> rdma dev stat auto pid
> >>>       Allocates a counter set per pid and binds all of the pid's QPs
> >>>       to this counter set
> >>>
> >>> rdma dev stat auto type
> >>>       Allocates a counter set per QP type (RC, UD, Raw, etc.) and
> >>>       binds all QPs of this type to this counter set
> >>>
> >>> rdma dev stat auto off
> >>>       Deallocates all counter sets that were configured as auto
> >>>
> >>> 4. Exposing of counters set content
> >>> ***********************************
> >>> This is an optional interface to print the counter set content. It
> >>> can print all counter sets or a specific counter set, according to
> >>> the working mode.
> >>>
> >>> Usage:
> >>> rdma dev stat show [counter set id]|[pid]|[type]
> >>>       Print statistics counters.
> >>>       - In case manual mode was used, optional <counter set id> can
> >>>         be used
> >>>       - In case auto per pid mode was used, optional <pid> can be
> >>>         used
> >>>       - In case auto per type mode was used, optional <type> can be
> >>>         used
> >>>
> >>> I would like to get feedback on this proposal
> >>>
> >>> Thanks
> >>> Ariel Almog
> >>> Mellanox
> >>>
> >
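
To make the manual mode in section 2 of the quoted RFC easier to discuss, here is a rough sketch of the alloc / bind / unbind / dealloc lifecycle. It is only a toy model of the semantics described in the RFC text, not the rdma tool and not any kernel interface. The class and method names (CounterSetTable, alloc, bind, unbind, dealloc) are hypothetical. It also assumes two things the RFC only implies: a QP is bound to at most one counter set at a time (since "unbind" takes only a QP number), and dealloc is rejected while QPs are still bound.

```python
class CounterSetTable:
    """Toy model of manual counter-set binding. All names are hypothetical."""

    def __init__(self):
        self._next_id = 1
        self._members = {}    # counter set id -> set of bound QP numbers
        self._qp_to_set = {}  # QP number -> counter set id

    def alloc(self):
        """Allocate a counter set and return its id."""
        cs_id = self._next_id
        self._next_id += 1
        self._members[cs_id] = set()
        return cs_id

    def bind(self, qp_num, cs_id):
        """Bind a QP to an allocated counter set."""
        if cs_id not in self._members:
            raise KeyError(f"counter set {cs_id} is not allocated")
        if qp_num in self._qp_to_set:
            # Assumption: one counter set per QP at a time.
            raise ValueError(f"QP {qp_num} is already bound")
        self._members[cs_id].add(qp_num)
        self._qp_to_set[qp_num] = cs_id

    def unbind(self, qp_num):
        """Unbind a QP from whichever counter set it is bound to."""
        if qp_num not in self._qp_to_set:
            raise KeyError(f"QP {qp_num} is not bound")
        cs_id = self._qp_to_set.pop(qp_num)
        self._members[cs_id].discard(qp_num)

    def dealloc(self, cs_id):
        """Deallocate a counter set; all QPs must be unbound first."""
        if cs_id not in self._members:
            raise KeyError(f"counter set {cs_id} is not allocated")
        if self._members[cs_id]:
            raise RuntimeError("unbind all QPs before deallocation")
        del self._members[cs_id]

    def bound_qps(self, cs_id):
        """Return the QP numbers currently bound to a counter set."""
        return sorted(self._members[cs_id])


if __name__ == "__main__":
    table = CounterSetTable()
    cs = table.alloc()          # step a
    table.bind(17, cs)          # step b
    print(table.bound_qps(cs))  # step c would read counters here
    table.unbind(17)            # step d
    table.dealloc(cs)           # step e
```

Whether dealloc should fail or implicitly unbind the remaining QPs is exactly the kind of detail the RFC leaves open; the sketch picks the stricter reading.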