RE: [RFC] IB/sa: Route SA pathrecord query through netlink

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Thu, May 21, 2015 at 07:52:36AM -0600, Wan, Kaike wrote:
> >
> > The general format of the request and response will be the same:
> >
> >   ------------------
> >   | netlink header |
> >   ------------------
> >   |  Data header   |
> >   ------------------
> >   |      Data      |
> >   ------------------
> >
> > The data header contains information about the type of
> > request/response, the status (for response), the type (format) of the
> > data, the total length of the data header + data, and a flags field about the
> request/response or data.
> >
> > Based on the type of the data, the data section may be in different
> > format: a string about the host name to resolve, an IP4/IP6 address, a
> > pathrecord, a user pathrecord (struct ib_user_path_rec), or simply a MAD
> (like our posted patches), etc.
> 
> I think given the new plans this is not really necessary.

Why not? Communication of the ib_mad/ib_sa layer with user space might find using MAD format more natural. It all depends  on where  the kernel component is, or the format of the input data that are available to it.

> 
> >
> > Essentially it can be of any format based on the data type. The key is
> > to document the format so that the kernel and user space can
> > communicate correctly.
> >
> > The details are described below:
> >
> > #define IB_NL_VERSION		0x01
> 
> Change all "IB" to "RDMA"

Will do.

> 
> >
> > #define IB_NL_OP_MASK		0x0F
> > #define IB_NL_OP_RESOLVE	0x01
> > #define IB_NL_OP_QUERY_PATH	0x02
> > #define IB_NL_OP_SET_TIMEOUT	0x03
> > #define IB_NL_OP_ACK		0x80
> >
> > #define IB_NL_STATUS_SUCCESS	0x0000
> > #define IB_NL_STATUS_ENODATA	0x0001
> 
> If we do what Doug suggested should we just make OP 16bits with the high
> bit ACK/NACK?
> 
> Then use the other u8 as a detailed status if needed.
In that case,  Use :

   __u8 version;
   __u8 status;
  __u16 opcode;

We will have more opcode space. 

> 
> >
> > #define IB_NL_DATA_TYPE_INVALID			0x0000
> > #define IB_NL_DATA_TYPE_NAME			0x0001
> > #define IB_NL_DATA_TYPE_ADDRESS_IP		0x0002
> > #define IB_NL_DATA_TYPE_ADDRESS_IP6		0x0003
> > #define IB_NL_DATA_TYPE_PATH_RECORD		0x0004
> > #define IB_NL_DATA_TYPE_USER_PATH_REC		0x0005
> 
> Do we need both PATH_RECORD and USER_PATH_REC?

With subheader definition (In my response to Jason's comments), we need only PATH_RECORD type. I will update my RFC.

> 
> I'm having trouble determining when the OP == QUERY_PATH and the
> DATA_TYPE != PATH_RECORD.
> 
> Why don't we remove "QUERY_PATH" above and allow OP == RESOLVE and
> DATA_TYPE == PATH_RECORD be a "query for path record"?

TRUE.

> 
> > #define IB_NL_DATA_TYPE_MAD			0x0006
> 
> I would drop this for now.
> 
> >
> > #define IB_NL_FLAGS_PATH_GMP			1
> > #define IB_NL_FLAGS_PATH_PRIMARY		(1<<1)
> > #define IB_NL_FLAGS_PATH_ALTERNATE		(1<<2)
> > #define IB_NL_FLAGS_PATH_OUTBOUND		(1<<3)
> > #define IB_NL_FLAGS_PATH_INBOUND		(1<<4)
> > #define IB_NL_FLAGS_PATH_INBOUND_REVERSE 	(1<<5)
> > #define IB_NL_FLAGS_PATH_BIDIRECTIONAL
> 	(IB_PATH_OUTBOUND | IB_PATH_INBOUND_REVERSE)
> > #define IB_NL_FLAGS_QUERY_SA			(1<<31)
> > #define IB_NL_FLAGS_NODELAY			(1<<30)
> >
> > struct ib_nl_data_hdr {
> > 	__u8	version;
> > 	__u8	opcode;
> > 	__u16	status;
> > 	__u16	type;
> > 	__u16	reserved;
> > 	__u32	flags;
> > 	__u32	length;
> 
> The overall message length is in the netlink header.  So keeping in mind
> Jasons comments regarding returning multiple data records.

With this length field, a netlink package could carry multiple requests/responses (opcodes). Each request/response could carry multiple data sections.

> 
> I think this should be the length of individual records with a "num records"
> also specified

With data sections, each section may have different length. Within a data section, when all records have the same size, we can carry the number of records field. 

.  I would much prefer this over the yucky IBTA RMPP method
> of implicit record sizes needing to be divided into the overall message size.
> 
> > };
> 
> Should we have a "class" value in the header somewhere?  With multiple
> user space listeners it could be easier to mux messages with such a value.
> The class/seq would then differentiate the message.

Then use a different multicast group. It does not make sense to have multiple listeners for the same multicast group and they all can serve the same request.

> 
> >
> > struct ib_nl_data {
> > 	struct ib_nl_data_hdr		hdr;
> > 	__u8				data[0];
> > };
> >
> >
> > These defines and structures can be added to file
> > include/upai/rdma/rdma_netlink.h (replace with RDMA_NL prefix) or
> contained in a seperate file (include/upai/rdma/ib_netlink.h ???).
> 
> This is not as important as getting the protocol down.

Minor.

> 
> Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux