[RFC v1 for verbs counters]


 



RFC: Verbs Counters


There is a constant demand for visibility into the connections used in verbs (and into other aspects).
Some vendors have long offered hardware counters through sysfs.
Those counters, however, are not available per connection, only for the whole system.

One way to do it is for each vendor to offer its own vendor-specific counters. Because each
 vendor could have its own implementation of counters, the resulting verbs interface
 would not be generic across vendors.


We present a generic interface for using counters in verbs.
Let's start with some definitions before going into details:

Object:          an existing structure in verbs which describes a physical entity,
                                 e.g.: QP/FLOW/DEVICE
Counter:         a single attribute which is used to count events/statistics
                                 on an object
Counter-set:     a set of counters that belong to one specific object.


A generic interface with the following functionality is presented:


1.      A way to list all the available counter-sets in the device.
        For each counter-set:
        - What the counters within the set measure: a QP? A flow? Something else?
        - A unique identifier per counter-set.
        - A list of names for all the counters within each counter-set.
           Since each vendor has its own counters/stats, each vendor
           could use its own names for a counter. This suggestion
           aims to replace vendor-specific APIs with predefined
           enums/names for each counter/stat.
        - Additional metadata about a counter-set (for example: is it cached?)
        
2.      Operations available per counter-set:
2.1     Bind and unbind:
        A counter-set has to be attached to an object in order for any
        counter within the counter-set to count. The attaching action
        is referred to as 'bind' and the opposite action is referred to
        as 'unbind'.
        Rather than adding dedicated generic operations for bind and unbind,
        I chose to use existing verbs methods. The existing methods could
        be modified with small changes (like adding a new flag) to bind
        (or unbind) a counter-set to an object.


2.2     Counter-set may be created: a counter-set instance is allocated and
        created on an ibv context and belongs to that context.

2.3     Counter-set may be destroyed: the counter-set instance is destroyed and
        de-allocated. If the counter-set is bound to an object, then it is the
        responsibility of the driver either to unbind it prior to hardware
        de-allocation or to notify the user that the driver is unable to destroy
        the counter-set, in which case it is the user's responsibility to unbind
        prior to destruction.

2.4     Counter-set may be queried: the user supplies a counter-set instance and
        an output address. The hardware queries the counter-set and writes the
        output to that address as an array of uint64_t. Each entry in the
        uint64_t array represents a single counter.
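
The second option in 2.3 (refuse to destroy a bound counter-set) can be sketched as follows. This is only an illustration under stated assumptions: the mock_* names, the bound_objects field and the choice of EBUSY as the error code are hypothetical and are not part of the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a counter-set instance. The bound_objects
 * field is an assumption made only for this sketch. */
struct mock_counter_set {
        uint64_t handle;
        unsigned bound_objects;   /* objects currently bound to this set */
};

/* Allocate an unbound instance; returns NULL if allocation fails. */
struct mock_counter_set *mock_create(uint64_t handle)
{
        struct mock_counter_set *cs = calloc(1, sizeof(*cs));

        if (cs)
                cs->handle = handle;
        return cs;
}

void mock_bind(struct mock_counter_set *cs)
{
        cs->bound_objects++;
}

void mock_unbind(struct mock_counter_set *cs)
{
        assert(cs->bound_objects > 0);   /* unbind without bind is a bug */
        cs->bound_objects--;
}

/* "Refuse" policy: destroying a bound set fails and the caller has to
 * unbind first. Returns 0 on success, -1 with errno set on failure. */
int mock_destroy(struct mock_counter_set *cs)
{
        if (cs->bound_objects > 0) {
                errno = EBUSY;   /* assumed error code */
                return -1;
        }
        free(cs);
        return 0;
}
```

With this policy a caller that gets -1 from destroy would unbind the counter-set from its object and retry; the alternative policy would perform the unbind inside the driver instead.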


The user is expected to query the device on startup,
to find which counter-sets are supported and to which objects
each counter-set may be bound. During this scan the user
also finds out which counters are supported for which object.
        
Example of a way to list all the available counter-sets in the device:


We modify the method  int query_device_ex() by adding a new flag to the 
enum ibv_device_attr_mask:


+          IBV_DEVICE_ATTR_COUNTER_SET        = 1 <<  1


When this flag is used, the device will respond with a struct ibv_device_attr_ex
that carries a new attribute:


+          uint64_t      max_supported_counter_sets;


A user can then use a new API to get the description of each counter-set.
Each counter-set is identified by a counter-set id.
A counter-set id is a number from 0 to max_supported_counter_sets,
that is, the number returned from the query_device_ex() call.


int ibv_query_counter_set_description(struct ibv_context *context,
                                      uint64_t counter_set_id,
                                      struct ibv_counter_set_description *out);
- return 0 on success
- return -1 when counter_set_id is invalid.
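
The startup scan described above can be sketched against a mock device. Everything prefixed sketch_ is hypothetical: the reduced description struct, the table-backed device and the helper names are assumptions for illustration. The sketch also assumes that valid ids are 0 to max_supported_counter_sets - 1, which the text does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_counted_type { SKETCH_COUNTER_QP = 0, SKETCH_COUNTER_FLOW = 1 };

/* Reduced description: only the fields this scan needs. */
struct sketch_description {
        uint8_t counted_type;
        uint8_t entries_count;
};

/* Hypothetical device: a fixed table stands in for the hardware. */
struct sketch_device {
        uint64_t max_supported_counter_sets;
        const struct sketch_description *table;
};

/* Mirrors the proposed return convention:
 * 0 on success, -1 when the id is invalid. */
int sketch_query_description(const struct sketch_device *dev, uint64_t id,
                             struct sketch_description *out)
{
        assert(dev != NULL && out != NULL);
        if (id >= dev->max_supported_counter_sets)
                return -1;
        *out = dev->table[id];
        return 0;
}

/* Startup scan: return the first counter-set id whose counters measure
 * objects of the given type, or -1 if the device has none. */
int64_t sketch_find_set_for_type(const struct sketch_device *dev, uint8_t type)
{
        struct sketch_description d;

        for (uint64_t id = 0; id < dev->max_supported_counter_sets; id++) {
                if (sketch_query_description(dev, id, &d) != 0)
                        continue;
                if (d.counted_type == type)
                        return (int64_t)id;
        }
        return -1;
}
```

An application would typically run such a scan once after opening the device and remember the ids it intends to instantiate later.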


The API writes to out the following structure:


struct ibv_counter_set_description {
            // Which object type does this set refer to?
            // value is taken from enum ibv_counter_set_counted_type
            uint8_t            counted_type;
            // Number of instances of this counter-set available in the hardware
            uint64_t           number_of_counter_sets;
            // Attributes of the set (bit mask)
            // value is taken from enum ibv_counter_set_attributes
            uint32_t           attributes;
            // number of entries
            uint8_t            entries_count;
            // List of entries
            struct ibv_counter_entry  entry[256];
};


Where:
struct ibv_counter_entry {
             // name of the entry; the last entry holds an empty name
             char       name[32];
};
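
Since the position of a name in the entry list is also the position of its value in the query output, a user will usually want to translate a counter name into an index. A minimal sketch, assuming the list ends at the first empty name; the sketch_ names and the counter names used in the usage note are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_ENTRIES 256
#define SKETCH_NAME_LEN    32

/* Same layout as the proposed struct ibv_counter_entry. */
struct sketch_counter_entry {
        char name[SKETCH_NAME_LEN];
};

/* Return the index of the counter called `wanted`, or -1 if it is not
 * in the list. The walk stops at the first empty name. */
int sketch_find_counter(const struct sketch_counter_entry *entry,
                        const char *wanted)
{
        assert(entry != NULL && wanted != NULL);
        for (int i = 0; i < SKETCH_MAX_ENTRIES && entry[i].name[0] != '\0'; i++) {
                if (strncmp(entry[i].name, wanted, SKETCH_NAME_LEN) == 0)
                        return i;
        }
        return -1;
}
```

For a list holding the made-up names "rx_packets" and "rx_bytes", looking up "rx_bytes" yields index 1, which is then the slot to read in the uint64_t output array.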
===========================


Brief explanation for the fields inside struct ibv_counter_set_description:


counted_type - contains the id of the object type this counter_set relates to.
The id is a value from ibv_counter_set_counted_type (see below).
Each counter-set relates to a verbs object, which is the verbs object this
counter-set aims to count (i.e. measure), such as a QP or a Flow.


enum ibv_counter_set_counted_type {
                IBV_COUNTER_IBV_QP = 0,
                IBV_COUNTER_IBV_FLOW,
                ...
};

number_of_counter_sets - how many instances of this counter-set does the device support?
Note that this value can be interpreted in more than one way: either how many
counter_sets are currently available, or the total (max) number of
counter_sets the device supports. Here it is treated as the maximum number of
counter-sets which the process is allowed to create.


attributes   - special attributes which this counter-set might have,
either in software or in hardware.
For example, we can have a cached counter-set, which means that every query
for that set is read from the cache unless a request to read the values from
the hardware was explicitly specified.


enum ibv_counter_set_attributes {
         // the counter-set value is cached by default 
          IBV_COUNTER_ATTR_CACHED                         = 1 <<  1   
};


entries_count - number of entries in the counter_set

struct ibv_counter_entry  entry[256] - an ordered list of counter names
           where the last name in the array is empty (NULL)

====


Example:




ibv_counter_set_description is a struct that describes other structs.
                For example, suppose we have the following struct:


                struct guy_counter {
                                uint64_t apples_kg;
                                uint64_t apples_count;
                };



                - note that all counters are 64-bit entries

                The ibv_counter_set_description then looks like this
                (names are written as assignments for brevity; name is a char array):
                   // according to the ibv_counter_set_counted_type enum
                ibv_counter_set_desc.counted_type = IBV_COUNTER_IBV_GUYGUY;
                ibv_counter_set_desc.attributes = 0;
                ibv_counter_set_desc.number_of_counter_sets = 1000;
                ibv_counter_set_desc.entries_count = 2;
                ibv_counter_set_desc.entry[0].name = "apples [Kg]";
                ibv_counter_set_desc.entry[1].name = "apples [Count]";
                ibv_counter_set_desc.entry[2].name = "";   // empty name ends the list
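
A compilable version of this example might look like the sketch below. The sketch_ types are reduced stand-ins for the proposed structures (only the name list and entries_count are kept), and both helper functions are hypothetical. Names are copied with snprintf because name is a char array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The structure being described: every member is a 64-bit counter. */
struct guy_counter {
        uint64_t apples_kg;
        uint64_t apples_count;
};

/* Reduced stand-ins for the proposed description structures. */
struct sketch_entry {
        char name[32];
};

struct sketch_desc {
        uint8_t entries_count;
        struct sketch_entry entry[256];
};

/* Fill the description for guy_counter. Zeroing the struct first leaves
 * entry[2].name empty, which terminates the name list. */
void sketch_describe_guy(struct sketch_desc *d)
{
        assert(d != NULL);
        memset(d, 0, sizeof(*d));
        snprintf(d->entry[0].name, sizeof(d->entry[0].name), "%s", "apples [Kg]");
        snprintf(d->entry[1].name, sizeof(d->entry[1].name), "%s", "apples [Count]");
        d->entries_count = 2;
}

/* Decode a query result (uint64_t values in description order). */
struct guy_counter sketch_decode_guy(const uint64_t *out)
{
        struct guy_counter g = { .apples_kg = out[0], .apples_count = out[1] };

        return g;
}
```

The decode helper relies on the query output using the same order as the entry list, as stated for the query operation.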
                
====


How to fill the internal tables?


On initialization - the driver should query the device capabilities to see how many
                    counter-sets are supported. For each supported
                    counter-set the driver will act as follows:
1. Allocate a counter_set_id.
2. Register the counter_set_id with a pointer to the data structures holding the list of counters.
3. Finally - the driver returns the number of counter_sets supported in ibv_query_device_ex().
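
One possible shape for such an internal table is sketched below. The registry layout, its fixed capacity and all sketch_ names are assumptions made for illustration; the RFC does not prescribe how a driver stores this information.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SETS 16   /* arbitrary capacity for this sketch */

/* What the driver knows about one counter-set. */
struct sketch_set_info {
        uint8_t counted_type;
        const char *const *names;   /* NULL-terminated list of counter names */
};

/* Internal table: the array index doubles as the counter_set_id. */
struct sketch_registry {
        const struct sketch_set_info *sets[SKETCH_MAX_SETS];
        uint64_t count;
};

/* Steps 1 and 2: allocate the next free id and register the set under it.
 * Returns the new id, or -1 if the table is full. */
int64_t sketch_register_set(struct sketch_registry *r,
                            const struct sketch_set_info *info)
{
        assert(r != NULL && info != NULL);
        if (r->count >= SKETCH_MAX_SETS)
                return -1;
        r->sets[r->count] = info;
        return (int64_t)r->count++;
}

/* Final step: the value the driver would report as
 * max_supported_counter_sets. */
uint64_t sketch_supported_sets(const struct sketch_registry *r)
{
        return r->count;
}
```

Using the array index as the id keeps the later lookup for a description query trivial: a bounds check followed by one array access.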




Operations available for each counter-set:
Each counter-set instance is represented by the following structure:


struct ibv_counter_set {
        struct ibv_context       *context;
        uint64_t                 handle;
};




THE NEW API


struct ibv_counter_set *ibv_create_counter_set(struct ibv_context *context,
                                               uint16_t counter_set_id);


Method returns a struct ibv_counter_set which contains context+handle.
Actions: the method allocates memory for struct ibv_counter_set and then calls
the driver to allocate the actual hardware counter-set.
If successful, the method returns a pointer to a heap-allocated
struct ibv_counter_set.
If unsuccessful, the method returns NULL and sets errno accordingly.


int ibv_destroy_counter_set(struct ibv_counter_set *counter_set);


Method destroys the input counter_set and frees the allocated memory.
Actions: the method attempts to remove the hardware counter-set and then the input
struct is released (deleted). In the kernel, the code checks whether the caller is
allowed to destroy the counter_set (by comparing pid) and then releases the
hardware resource.


If unsuccessful, the method returns -1 and sets errno accordingly.
If successful, the method returns 0.


int ibv_query_counter_set(struct ibv_query_counter_set_attr query, uint64_t *out);

Method receives a query structure and an output address, then queries the
hardware and writes the output to the uint64_t *out address.
Actions: the method receives struct ibv_query_counter_set_attr, parses the query
and then sends it to the kernel for execution.
In the kernel, the code checks whether the caller is allowed to query
the hardware, executes the query and then writes to *out.
If unsuccessful, the method returns -1 and sets errno accordingly.
If successful, the method returns 0.


Where:


struct ibv_query_counter_set_attr {
                        uint32_t                 comp_mask;
                        struct ibv_counter_set   *counter_set;
                        enum ibv_query_counter_set_attr_params  *query_params;
};


enum ibv_query_counter_set_attr_params {
        // force hardware query instead of cached value
        IBV_COUNTER_FORCE_UPDATE                = 1 <<  1  
};
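
The interaction between a cached counter-set and a forced update can be illustrated with a small model. This is only a sketch: the hw array stands in for the live hardware counters, the flag is passed as a plain bit mask rather than through the query structure, and all sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_N 2   /* counters in this made-up set */

/* Same bit position as the proposed IBV_COUNTER_FORCE_UPDATE. */
enum { SKETCH_FORCE_UPDATE = 1 << 1 };

struct sketch_cached_set {
        uint64_t hw[SKETCH_N];      /* stand-in for the live hardware values */
        uint64_t cache[SKETCH_N];   /* last snapshot taken from hw */
};

/* Query a cached counter-set. Without the force flag the cached
 * snapshot is returned; with it, the snapshot is refreshed first.
 * Returns 0 on success, -1 if out is NULL. */
int sketch_query(struct sketch_cached_set *cs, uint32_t params, uint64_t *out)
{
        assert(cs != NULL);
        if (out == NULL)
                return -1;
        if (params & SKETCH_FORCE_UPDATE)
                memcpy(cs->cache, cs->hw, sizeof(cs->cache));
        memcpy(out, cs->cache, sizeof(cs->cache));
        return 0;
}
```

In this model a plain query after a forced one keeps returning the refreshed snapshot until the next forced update, even if the hardware values have moved on in the meantime.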




int ibv_query_counter_set_description(struct ibv_context *context,
                                      uint64_t counter_set_id,
                                      struct ibv_counter_set_description *out);

Method writes out a struct ibv_counter_set_description which contains a description
of a counter-set.
The user should allocate sizeof(struct ibv_counter_set_description) bytes for *out.


- return 0 on success
- return -1 when counter_set_id is invalid.




Example of using counter_sets:




void foo(struct ibv_context *context, int counter_set_id)
{
        // container for the counter-set values, one uint64_t per counter
        uint64_t my_counter[256];

        struct ibv_counter_set_description my_description;
        struct ibv_counter_set *my_counter_set;

        if (ibv_query_counter_set_description(context, counter_set_id, &my_description)) {
                printf("invalid counter_set_id\n");
                return;
        }

        my_counter_set = ibv_create_counter_set(context, counter_set_id);
        if (!my_counter_set) {
                printf("create failed\n");
                return;
        }

        // let's define the query object
        struct ibv_query_counter_set_attr my_query;
        my_query.comp_mask = 0;
        my_query.counter_set = my_counter_set;   // already a pointer
        my_query.query_params = NULL;            // no special query parameters

        // finally - do the query.
        if (-1 == ibv_query_counter_set(my_query, my_counter)) {
                printf("query failed\n");
        } else {
                for (int i = 0; i < my_description.entries_count; ++i) {
                        printf("%s = %llu\n", my_description.entry[i].name,
                               (unsigned long long)my_counter[i]);
                }
        }

        // release the counter-set once we are done with it
        ibv_destroy_counter_set(my_counter_set);
}