RE: [RFC] RDMA : performance statistics

> -----Original Message-----
> From: Gal Pressman <galpress@xxxxxxxxxx>
> Sent: Tuesday, November 27, 2018 2:47 PM
> To: Ariel Almog <ariela@xxxxxxxxxxxx>; Gal Pressman
> <galpress@xxxxxxxxxx>; Ariel Almog <arielalmogworkemails@xxxxxxxxx>;
> linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>; Jason
> Gunthorpe <jgg@xxxxxxxxxxxx>
> Subject: Re: [RFC] RDMA : performance statistics
> 
> On 27-Nov-18 12:55, Ariel Almog wrote:
> >
> >
> >> -----Original Message-----
> >> From: Gal Pressman <galpress@xxxxxxxxxx>
> >> Sent: Tuesday, November 27, 2018 11:50 AM
> >> To: Ariel Almog <arielalmogworkemails@xxxxxxxxx>; linux-
> >> rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>; Jason
> >> Gunthorpe <jgg@xxxxxxxxxxxx>; Ariel Almog <ariela@xxxxxxxxxxxx>
> >> Subject: Re: [RFC] RDMA : performance statistics
> >>
> >> On 26-Nov-18 17:36, Ariel Almog wrote:
> >>> Hi,
> >>>
> >>> Please find below an RFC for exposing statistics counters for the RDMA
> >>> subsystem.
> >>>
> >>> The proposed tool will be a supplementary command to the already
> >>> existing rdma tool and will give the user a variety of options for RDMA
> >>> debugging and performance analysis.
> >>>
> >>> The tool will provide global counters out of the box. In addition,
> >>> the user will be able to manually tune it to monitor specific QP(s),
> >>> or to automatically monitor QP(s) according to predefined criteria.
> >>
> >> Hi Ariel,
> >> Any reason to limit monitoring to QPs only? what about other resources?
> >>
> >
> > Hi Gal
> >
> > Long time no talk :-)
> 
> Indeed :)
> 
> >
> > No reason for not having other monitoring abilities under this tool.
> > If the resource is general, the interface is ready for it.
> > If the resource belongs to some object, it shall be added (now or in
> > the future).
> >
> > Which other resources did you have in mind?
> 
> I think that having CQs/AHs/MRs/.. as well would be nice, maybe allow
> counters for every resource that is tracked by the rdma resource tracker.
> Will need to make adjustments to the "auto" mode to fit each one.
 
Agreed 

> Can we extend the interface to support vendor specific resources as well?

Do you mean vendor-specific resources that don't fall under the
QPs/CQs/AHs/MRs/global categories?
I feel that allowing vendor-specific counters under these categories is a
must. Having a separate vendor-specific category would require justification.

> what happens with interesting counters that we can't bind to a specific
> resource?

If it is a global resource, I see no issue with presenting it.

> >
> >>>
> >>> In detail
> >>> *********
> >>>
> >>> In this RFC, the term "counter set" refers to a group of multiple
> >>> counters that is handled as a single unit. For example, a counter set
> >>> might include a duplicate request counter, an implied nak seq err
> >>> counter, etc.
> >>>
> >>> A counter set is part of a namespace and cannot be viewed/set from
> >>> other namespaces.
> >>>
> >>> 1. Default, general counter
> >>> ***************************
> >>> This is an optional default counter set, which is allocated by the
> >>> RDMA driver upon init and is used to count all the traffic that is
> >>> passing through the device. No configuration is needed, and it cannot
> >>> be disabled.
> >>>
> >>> Usage:
> >>> rdma dev stat
> >>>      Shows the statistics of the general counter set
> >>>
> >>> 2. Manual bind/unbind of counter set to QP(s)
> >>> *********************************************
> >>> This is an optional interface that allows the user to select QPs to be
> >>> monitored. The process requires the user to:
> >>>   a. Allocate counter_set
> >>>   b. Bind QP(s) to the allocated counter_set
> >>>   c. Monitor the counter_set
> >>>   d. Unbind QP(s) from the allocated counter_set
> >>>   e. Deallocate the counter_set
> >>>
> >>> Usage:
> >>> rdma dev stat alloc
> >>>      Allocates and returns a counter set id which can be bound to QP(s)
> >>>
> >>> rdma dev stat dealloc <counter set id>
> >>>      Deallocates a counter set id. All bound QP(s) shall be unbound
> >>>      before deallocation
> >>>
> >>> rdma dev stat bind <qp num> <counter set id>
> >>>      Binds QP to a counter set
> >>>
> >>> rdma dev stat unbind <qp num>
> >>>      Unbinds a QP from a counter set
> >>>
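For illustration only, the manual flow above (alloc, bind, monitor, unbind,
dealloc) can be sketched as a toy in-memory model in Python. This is not the
rdma tool or any kernel interface: the class and method names are
hypothetical, and the rule that a QP is bound to at most one counter set is
an assumption, inferred from "unbind" taking only a QP number.

```python
class CounterSetRegistry:
    """Toy model of the manual counter set lifecycle (hypothetical names)."""

    def __init__(self):
        self._next_id = 0
        self._sets = {}       # counter set id -> set of bound QP numbers
        self._qp_to_set = {}  # QP number -> counter set id

    def alloc(self):
        """Allocate a new, empty counter set and return its id."""
        cs_id = self._next_id
        self._next_id += 1
        self._sets[cs_id] = set()
        return cs_id

    def bind(self, qp_num, cs_id):
        """Bind a QP to an allocated counter set."""
        if cs_id not in self._sets:
            raise KeyError(f"counter set {cs_id} is not allocated")
        if qp_num in self._qp_to_set:
            # Assumption: a QP is bound to at most one counter set.
            raise ValueError(f"QP {qp_num} is already bound")
        self._sets[cs_id].add(qp_num)
        self._qp_to_set[qp_num] = cs_id

    def unbind(self, qp_num):
        """Unbind a QP from whatever counter set it is bound to."""
        if qp_num not in self._qp_to_set:
            raise KeyError(f"QP {qp_num} is not bound")
        cs_id = self._qp_to_set.pop(qp_num)
        self._sets[cs_id].discard(qp_num)

    def dealloc(self, cs_id):
        """Deallocate a counter set; all QPs must be unbound first."""
        if cs_id not in self._sets:
            raise KeyError(f"counter set {cs_id} is not allocated")
        if self._sets[cs_id]:
            raise RuntimeError("unbind all QPs before deallocation")
        del self._sets[cs_id]

    def bound_qps(self, cs_id):
        """Return the QP numbers bound to a counter set, sorted."""
        return sorted(self._sets[cs_id])
```

With this model, deallocating a counter set that still has a QP bound to it
raises an error, which mirrors the "unbind before dealloc" requirement in the
usage text.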
> >>>
> >>> 3. Automatic bind/unbind of counter sets to QPs
> >>> ***********************************************
> >>> This is an optional interface that allows the user to automatically
> >>> group QPs into counter sets according to common criteria. For example,
> >>> in a per-pid scheme, all QPs belonging to a pid are bound automatically
> >>> to a single counter set.
> >>>
> >>> Usage:
> >>> rdma dev stat auto pid
> >>>      Allocates a counter set per pid and binds all of the pid's QPs to
> >>>      this counter set
> >>>
> >>> rdma dev stat auto type
> >>>      Allocates a counter set per QP type (RC, UD, Raw, etc.) and binds
> >>>      all QPs of this type to this counter set
> >>>
> >>> rdma dev stat auto off
> >>>      Deallocates all counter sets that were configured as auto
> >>>
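As a rough sketch of the auto mode, the grouping step could look like the
Python below: one counter set per distinct pid or per distinct QP type. The
function name, the dictionary layout of a QP record, and the sample values
are all hypothetical and only serve to show the grouping rule.

```python
from collections import defaultdict


def auto_bind(qps, criterion):
    """Group QP numbers into one counter set per value of `criterion`.

    `qps` is a list of dicts with the (hypothetical) keys "qpn", "pid"
    and "type". `criterion` is "pid" or "type", matching the two auto
    modes in the RFC. Returns a dict: criterion value -> list of QPNs.
    """
    if criterion not in ("pid", "type"):
        raise ValueError(f"unsupported auto criterion: {criterion!r}")
    groups = defaultdict(list)
    for qp in qps:
        groups[qp[criterion]].append(qp["qpn"])
    return dict(groups)


# Made-up sample QPs, for illustration only.
sample_qps = [
    {"qpn": 10, "pid": 100, "type": "RC"},
    {"qpn": 11, "pid": 100, "type": "UD"},
    {"qpn": 12, "pid": 200, "type": "RC"},
]

by_pid = auto_bind(sample_qps, "pid")
by_type = auto_bind(sample_qps, "type")
```

Here each key of the returned dict stands for one automatically allocated
counter set, and "auto off" would correspond to dropping all of them.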
> >>> 4. Exposing counter set content
> >>> *******************************
> >>> This is an optional interface to print the counter set content. It
> >>> can print all counter sets or a specific counter set, according to the
> >>> working mode.
> >>>
> >>> Usage:
> >>> rdma dev stat show [counter set id]|[pid]|[type]
> >>>      Print statistics counters.
> >>>      - If manual mode was used, the optional <counter set id> can be used
> >>>      - If auto per-pid mode was used, the optional <pid> can be used
> >>>      - If auto per-type mode was used, the optional <type> can be used
> >>>
> >>> I would like to get feedback on this proposal
> >>>
> >>> Thanks
> >>> Ariel Almog
> >>> Mellanox
> >>>
> >




