Re: [RFC] RDMA : performance statistics

On 27-Nov-18 12:55, Ariel Almog wrote:
> 
> 
>> -----Original Message-----
>> From: Gal Pressman <galpress@xxxxxxxxxx>
>> Sent: Tuesday, November 27, 2018 11:50 AM
>> To: Ariel Almog <arielalmogworkemails@xxxxxxxxx>; linux-
>> rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leon@xxxxxxxxxx>; Jason
>> Gunthorpe <jgg@xxxxxxxxxxxx>; Ariel Almog <ariela@xxxxxxxxxxxx>
>> Subject: Re: [RFC] RDMA : performance statistics
>>
>> On 26-Nov-18 17:36, Ariel Almog wrote:
>>> Hi,
>>>
>>> Please find below an RFC for exposing statistics counters for the RDMA
>>> subsystem.
>>>
>>> The proposed tool will be a supplementary command to the existing
>>> rdma tool and will give the user a variety of options for RDMA debug and
>>> performance analysis.
>>>
>>> The tool will provide global counters out of the box. In addition, the
>>> user will be able to manually tune it to monitor specific QP(s), or to
>>> automatically monitor QP(s) according to predefined criteria.
>>
>> Hi Ariel,
>> Any reason to limit monitoring to QPs only? What about other resources?
>>
> 
> Hi Gal 
> 
> Long time no talk :-) 

Indeed :)

> 
> No reason for not having other monitoring abilities under this tool.
> If the resource is general, the interface is ready for it.
> If the resource belongs to some object, it shall be added (now or in the future).
>
> Which other resources did you have in mind?

I think that having CQs/AHs/MRs/.. as well would be nice, maybe allow
counters for every resource that is tracked by the rdma resource tracker.
We will need to adjust the "auto" mode to fit each one.

Can we extend the interface to support vendor-specific resources as
well? What happens with interesting counters that we can't bind to a
specific resource?
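
A minimal sketch of what an "auto" mode generalized beyond QPs might look
like, purely for illustration. The Resource record, its fields, and the
auto_bind() helper are hypothetical names and are not part of the RFC or of
any existing rdma tool interface. It assumes each tracked resource exposes a
kind, an id, and an owning pid, and groups resources into one counter set
per (kind, criterion value) pair:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical record for one tracked resource."""
    kind: str   # e.g. "qp", "cq", "mr"
    ident: int  # resource number within its kind
    pid: int    # owning process


def auto_bind(resources, key):
    """Group resources into counter sets, one per (kind, key(resource))."""
    counter_sets = defaultdict(list)
    for res in resources:
        counter_sets[(res.kind, key(res))].append(res)
    return dict(counter_sets)


resources = [
    Resource("qp", 1, 100),
    Resource("qp", 2, 100),
    Resource("cq", 7, 100),
    Resource("mr", 3, 200),
]
# Per-pid criterion: QPs and CQs of the same pid still land in separate sets.
per_pid = auto_bind(resources, key=lambda r: r.pid)
```

Keeping the resource kind in the grouping key is one possible way to let each
kind have its own criteria; counters that cannot be tied to any resource
would still need a separate path.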

> 
>>>
>>> In detail
>>> **********
>>>
>>> In this RFC, the term "counter set" refers to a group of multiple
>>> counters that is handled as a single unit. For example, a counter set
>>> might include duplicate request counter, implied nak seq err counter, etc.
>>>
>>> A counter set is part of a namespace and cannot be viewed/set from other
>>> namespaces.
>>>
>>> 1. Default, general counter
>>> ***************************
>>> This is an optional, default counter set, which is allocated by the
>>> RDMA driver upon init and is used to count all the traffic that is
>>> passing through the device. No configuration is needed and it cannot
>>> be disabled.
>>>
>>> Usage:
>>> rdma dev stat
>>>      Shows the statistics of the general counter set
>>>
>>> 2. Manual bind/unbind of counter set to QP(s)
>>> *********************************************
>>> This is an optional interface to allow the user to select QPs to be monitored.
>>> The process requires the user to:
>>>   a. Allocate counter_set
>>>   b. Bind QP(s) to the allocated counter_set
>>>   c. Monitor the counter_set
>>>   d. Unbind QP(s) from the allocated counter_set
>>>   e. Deallocate the counter_set
>>>
>>> Usage:
>>> rdma dev stat alloc
>>>      Allocates and returns a counter set id which can be bound to
>>>      QP(s)
>>>
>>> rdma dev stat dealloc <counter set id>
>>>      Deallocates a counter set id. All bound QP(s) shall be unbound before
>>>      deallocation
>>>
>>> rdma dev stat bind <qp num> <counter set id>
>>>      Binds QP to a counter set
>>>
>>> rdma dev stat unbind <qp num>
>>>      Unbinds a QP from a counter set
>>>
>>>
>>> 3. Automatic bind/unbind of counter sets to QPs
>>> ***********************************************
>>> This is an optional interface to allow the user to automatically bind
>>> groups of QPs to counter sets according to common criteria. For example,
>>> a per-pid scheme, where all QPs belonging to a pid are bound automatically
>>> to a single counter set.
>>>
>>> Usage:
>>> rdma dev stat auto pid
>>>      Allocates a counter set per pid and binds all of the pid's QPs to
>>>      this counter set
>>>
>>> rdma dev stat auto type
>>>      Allocates a counter set per QP type (RC, UD, Raw, etc.) and binds all
>>>      QPs of this type to this counter set
>>>
>>> rdma dev stat auto off
>>>      Deallocates all counter sets that were configured as auto
>>>
>>> 4. Exposing counter set content
>>> *******************************
>>> This is an optional interface to print the counter set content. It can
>>> print all counter sets or a specific counter set, according to the working mode.
>>>
>>> Usage:
>>> rdma dev stat show [counter set id]|[pid]|[type]
>>>      Prints statistics counters.
>>>      - In case manual mode was used, optional <counter set id> can be used
>>>      - In case auto per pid mode was used, optional <pid> can be used
>>>      - In case auto per type mode was used, optional <type> can be used
>>>
>>> I would like to get feedback on this proposal
>>>
>>> Thanks
>>> Ariel Almog
>>> Mellanox
>>>
> 
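
The manual alloc/bind/unbind/dealloc lifecycle from section 2 of the quoted
RFC can be modeled roughly as below. This is only a sketch under stated
assumptions, not the proposed kernel or netlink interface: CounterSetTable
and its methods are hypothetical names, and the rule that a QP is bound to
at most one counter set is inferred from "unbind <qp num>" taking no counter
set id. The check in dealloc() mirrors the RFC's requirement that all QPs be
unbound before deallocation.

```python
class CounterSetTable:
    """Hypothetical in-memory model of manually managed counter sets."""

    def __init__(self):
        self._next_id = 0
        self._sets = {}       # counter set id -> set of bound QP numbers
        self._qp_to_set = {}  # QP number -> counter set id

    def alloc(self):
        """Allocate a new, empty counter set and return its id."""
        cs_id = self._next_id
        self._next_id += 1
        self._sets[cs_id] = set()
        return cs_id

    def bind(self, qp_num, cs_id):
        """Bind a QP to an allocated counter set."""
        if cs_id not in self._sets:
            raise KeyError(f"unknown counter set {cs_id}")
        if qp_num in self._qp_to_set:
            # Assumption: one counter set per QP at a time.
            raise ValueError(f"QP {qp_num} is already bound")
        self._sets[cs_id].add(qp_num)
        self._qp_to_set[qp_num] = cs_id

    def unbind(self, qp_num):
        """Unbind a QP from whichever counter set it is bound to."""
        cs_id = self._qp_to_set.pop(qp_num)
        self._sets[cs_id].discard(qp_num)

    def bound_qps(self, cs_id):
        """Return a copy of the QP numbers bound to a counter set."""
        return set(self._sets[cs_id])

    def dealloc(self, cs_id):
        """Deallocate a counter set; refuse while QPs are still bound."""
        if self._sets[cs_id]:
            raise RuntimeError("unbind all QPs before deallocation")
        del self._sets[cs_id]
```

Whether dealloc should fail or implicitly unbind remaining QPs is a design
choice the RFC leaves to the "shall be unbound" wording; the sketch takes the
stricter reading.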



