On Fri, Feb 21, 2020 at 11:46:58AM +0100, Jack Wang wrote: > From: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxx> > > Introduce public header which provides set of API functions to > establish RDMA connections from client to server machine using > RTRS protocol, which manages RDMA connections for each session, > does multipathing and load balancing. > > Main functions for client (active) side: > > rtrs_clt_open() - Creates set of RDMA connections incapsulated > in IBTRS session and returns pointer on RTRS > session object. > rtrs_clt_close() - Closes RDMA connections associated with RTRS > session. > rtrs_clt_request() - Requests zero-copy RDMA transfer to/from > server. > > Main functions for server (passive) side: > > rtrs_srv_open() - Starts listening for RTRS clients on specified > port and invokes RTRS callbacks for incoming > RDMA requests or link events. > rtrs_srv_close() - Closes RTRS server context. > > Signed-off-by: Danil Kipnis <danil.kipnis@xxxxxxxxxxxxxxx> > Signed-off-by: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxx> > --- > drivers/infiniband/ulp/rtrs/rtrs.h | 310 +++++++++++++++++++++++++++++ > 1 file changed, 310 insertions(+) > create mode 100644 drivers/infiniband/ulp/rtrs/rtrs.h > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h > new file mode 100644 > index 000000000000..5e1c8a654e92 > --- /dev/null > +++ b/drivers/infiniband/ulp/rtrs/rtrs.h > @@ -0,0 +1,310 @@ > +/* SPDX-License-Identifier: GPL-2.0-or-later */ > +/* > + * RDMA Transport Layer > + * > + * Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved. > + * Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved. > + * Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved. > + */ > +#ifndef RTRS_H > +#define RTRS_H > + > +#include <linux/socket.h> > +#include <linux/scatterlist.h> > + > +struct rtrs_permit; > +struct rtrs_clt; > +struct rtrs_srv_ctx; > +struct rtrs_srv; > +struct rtrs_srv_op; > + > +/* > + * RDMA transport (RTRS) client API > + */ > + > +/** > + * enum rtrs_clt_link_ev - Events about connectivity state of a client > + * @RTRS_CLT_LINK_EV_RECONNECTED Client was reconnected. > + * @RTRS_CLT_LINK_EV_DISCONNECTED Client was disconnected. > + */ > +enum rtrs_clt_link_ev { > + RTRS_CLT_LINK_EV_RECONNECTED, > + RTRS_CLT_LINK_EV_DISCONNECTED, > +}; > + > +/** > + * Source and destination address of a path to be established > + */ > +struct rtrs_addr { > + struct sockaddr_storage *src; > + struct sockaddr_storage *dst; > +}; > + > +typedef void (link_clt_ev_fn)(void *priv, enum rtrs_clt_link_ev ev); > +/** > + * rtrs_clt_open() - Open a session to an RTRS server > + * @priv: User supplied private data. > + * @link_ev: Event notification callback function for connection state changes > + * @priv: User supplied data that was passed to rtrs_clt_open() > + * @ev: Occurred event > + * @sessname: name of the session > + * @paths: Paths to be established defined by their src and dst addresses > + * @path_cnt: Number of elements in the @paths array > + * @port: port to be used by the RTRS session > + * @pdu_sz: Size of extra payload which can be accessed after permit allocation. > + * @max_inflight_msg: Max. number of parallel inflight messages for the session > + * @max_segments: Max. number of segments per IO request > + * @reconnect_delay_sec: time between reconnect tries > + * @max_reconnect_attempts: Number of times to reconnect on error before giving > + * up, 0 for * disabled, -1 for forever > + * > + * Starts session establishment with the rtrs_server. The function can block > + * up to ~2000ms before it returns. > + * > + * Return a valid pointer on success otherwise PTR_ERR. > + */ > +struct rtrs_clt *rtrs_clt_open(void *priv, link_clt_ev_fn *link_ev, > + const char *sessname, > + const struct rtrs_addr *paths, > + size_t path_cnt, u16 port, > + size_t pdu_sz, u8 reconnect_delay_sec, > + u16 max_segments, > + s16 max_reconnect_attempts); > + > +/** > + * rtrs_clt_close() - Close a session > + * @sess: Session handle. Session is freed upon return. > + */ > +void rtrs_clt_close(struct rtrs_clt *sess); > + > +/** > + * rtrs_permit_to_pdu() - converts rtrs_permit to opaque pdu pointer > + * @permit: RTRS permit pointer, it associates the memory allocation for future > + * RDMA operation. > + */ > +void *rtrs_permit_to_pdu(struct rtrs_permit *permit); > + > +enum { > + RTRS_PERMIT_NOWAIT = 0, > + RTRS_PERMIT_WAIT = 1, > +}; > + > +/** > + * enum rtrs_clt_con_type() type of ib connection to use with a given > + * rtrs_permit > + * @USR_CON - use connection reserved vor "service" messages > + * @IO_CON - use a connection reserved for IO > + */ > +enum rtrs_clt_con_type { > + RTRS_USR_CON, > + RTRS_IO_CON > +}; > + > +/** > + * rtrs_clt_get_permit() - allocates permit for future RDMA operation > + * @sess: Current session > + * @con_type: Type of connection to use with the permit > + * @wait: Wait type > + * > + * Description: > + * Allocates permit for the following RDMA operation. Permit is used > + * to preallocate all resources and to propagate memory pressure > + * up earlier. > + * > + * Context: > + * Can sleep if @wait == RTRS_TAG_WAIT > + */ > +struct rtrs_permit *rtrs_clt_get_permit(struct rtrs_clt *sess, > + enum rtrs_clt_con_type con_type, > + int wait); > + > +/** > + * rtrs_clt_put_permit() - puts allocated permit > + * @sess: Current session > + * @permit: Permit to be freed > + * > + * Context: > + * Does not matter > + */ > +void rtrs_clt_put_permit(struct rtrs_clt *sess, struct rtrs_permit *permit); > + > +typedef void (rtrs_conf_fn)(void *priv, int errno); > +/** > + * rtrs_clt_request() - Request data transfer to/from server via RDMA. > + * > + * @dir: READ/WRITE > + * @conf: callback function to be called as confirmation > + * @sess: Session > + * @permit: Preallocated permit > + * @priv: User provided data, passed back with corresponding > + * @(conf) confirmation. > + * @vec: Message that is sent to server together with the request. > + * Sum of len of all @vec elements limited to <= IO_MSG_SIZE. > + * Since the msg is copied internally it can be allocated on stack. > + * @nr: Number of elements in @vec. > + * @len: length of data sent to/from server > + * @sg: Pages to be sent/received to/from server. > + * @sg_cnt: Number of elements in the @sg > + * > + * Return: > + * 0: Success > + * <0: Error > + * > + * On dir=READ rtrs client will request a data transfer from Server to client. > + * The data that the server will respond with will be stored in @sg when > + * the user receives an %RTRS_CLT_RDMA_EV_RDMA_REQUEST_WRITE_COMPL event. > + * On dir=WRITE rtrs client will rdma write data in sg to server side. > + */ > +int rtrs_clt_request(int dir, rtrs_conf_fn *conf, struct rtrs_clt *sess, > + struct rtrs_permit *permit, void *priv, > + const struct kvec *vec, size_t nr, size_t len, > + struct scatterlist *sg, unsigned int sg_cnt); > + > +/** > + * rtrs_attrs - RTRS session attributes > + */ > +struct rtrs_attrs { > + u32 queue_depth; > + u32 max_io_size; > + u8 sessname[NAME_MAX]; > + struct kobject *sess_kobj; > +}; > + > +/** > + * rtrs_clt_query() - queries RTRS session attributes > + * > + * Returns: > + * 0 on success > + * -ECOMM no connection to the server > + */ > +int rtrs_clt_query(struct rtrs_clt *sess, struct rtrs_attrs *attr); > + > +/* > + * Here goes RTRS server API > + */ > + > +/** > + * enum rtrs_srv_link_ev - Server link events > + * @RTRS_SRV_LINK_EV_CONNECTED: Connection from client established > + * @RTRS_SRV_LINK_EV_DISCONNECTED: Connection was disconnected, all > + * connection RTRS resources were freed. > + */ > +enum rtrs_srv_link_ev { > + RTRS_SRV_LINK_EV_CONNECTED, > + RTRS_SRV_LINK_EV_DISCONNECTED, > +}; > + > +/** > + * rdma_ev_fn(): Event notification for RDMA operations > + * If the callback returns a value != 0, an error message > + * for the data transfer will be sent to the client. > + > + * @sess: Session > + * @priv: Private data set by rtrs_srv_set_sess_priv() > + * @id: internal RTRS operation id > + * @dir: READ/WRITE > + * @data: Pointer to (bidirectional) rdma memory area: > + * - in case of %RTRS_SRV_RDMA_EV_RECV contains > + * data sent by the client > + * - in case of %RTRS_SRV_RDMA_EV_WRITE_REQ points to the > + * memory area where the response is to be written to > + * @datalen: Size of the memory area in @data > + * @usr: The extra user message sent by the client (%vec) > + * @usrlen: Size of the user message > + */ > +typedef int (rdma_ev_fn)(struct rtrs_srv *sess, void *priv, > + struct rtrs_srv_op *id, int dir, > + void *data, size_t datalen, const void *usr, > + size_t usrlen); > + > +/** > + * link_ev_fn(): Events about connectivity state changes > + * If the callback returns != 0 and the event > + * %RTRS_SRV_LINK_EV_CONNECTED the corresponding session > + * will be destroyed. > + * @sess: Session > + * @ev: event > + * @priv: Private data from user if previously set with > + * rtrs_srv_set_sess_priv() > + */ > +typedef int (link_ev_fn)(struct rtrs_srv *sess, enum rtrs_srv_link_ev ev, > + void *priv); I don't think that it is good idea to add typedefs to hide function callbacks definitions. Thanks