On Mon, Jul 23, 2018 at 03:58:15PM +0300, Shamir Rabinovitch wrote:
> On Mon, Jul 16, 2018 at 11:03:53PM +0300, Yuval Shaia wrote:
> > RDMA MAD kernel module (ibcm) disallow more than one MAD-agent for a
> > given MAD class.
> > This does not go hand-by-hand with qemu pvrdma device's requirements
> > where each VM is MAD agent.
> > Fix it by adding implementation of RDMA MAD multiplexer service which on
> > one hand register as a sole MAD agent with the kernel module and on the
> > other hand gives service to more than one VM.

Hi Shamir,

>
> worth to mention on what git & branch you apply this patch. it's qemu
> git probably. but it's really need to be in rdma-core git.

This is up to the rdma-core maintainer. I brought it here for review as an
RFC to get feedback on the idea.
If it gets to the PATCH stage and the idea is accepted, I will port it to the
rdma-core tree.

>
> >
> > Design Overview:
> > ----------------
> > A server process is registered to UMAD framework (for this to work the
> > rdma_cm kernel module needs to be unloaded) and creates a unix socket to
>
> is it possible to implement same multiplexer in kernel and allow kernel
> user MAD support for multiple users at once? looks like better approach
> then the umad one..

I'm under the assumption that others would not like the idea of patching
the kernel for this purpose, but I might be wrong.

>
> > listen to incoming request from clients.
> > A client process (such as QEMU) connects to this unix socket and
> > registers with its own GID.
> >
> > TX:
> > ---
> > When client needs to send rdma_cm MAD message it construct it the same
> > way as without this multiplexer, i.e. creates a umad packet but this
> > time it writes its content to the socket instead of calling umad_send().
> > The server, upon receiving such a message fetch local_comm_id from it so
> > a context for this session can be maintain and relay the message to UMAD
> > layer by calling umad_send().
> >
> > RX:
> > ---
> > The server creates a worker thread to process incoming rdma_cm MAD
> > messages. When an incoming message arrived (umad_recv()) the server,
> > depending on the message type (attr_id) looks for target client by
> > either searching in gid->fd table or in local_comm_id->fd table. With
> > the extracted fd the server relays to incoming message to the client.
> >
> > Signed-off-by: Yuval Shaia <yuval.shaia@xxxxxxxxxx>
> > ---
> >  Makefile                         |   3 +
> >  Makefile.objs                    |   1 +
> >  contrib/rdmacm-mux/Makefile.objs |   3 +
> >  contrib/rdmacm-mux/main.c        | 680 +++++++++++++++++++++++++++++++
> >  4 files changed, 687 insertions(+)
> >  create mode 100644 contrib/rdmacm-mux/Makefile.objs
> >  create mode 100644 contrib/rdmacm-mux/main.c
> >
> > diff --git a/Makefile b/Makefile
> > index 2da686be33..9ef307ba6e 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -416,6 +416,7 @@ dummy := $(call unnest-vars,, \
> >                  qga-obj-y \
> >                  ivshmem-client-obj-y \
> >                  ivshmem-server-obj-y \
> > +                rdmacm-mux-obj-y \
> >                  libvhost-user-obj-y \
> >                  vhost-user-scsi-obj-y \
> >                  vhost-user-blk-obj-y \
> > @@ -717,6 +718,8 @@ vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
> >  	$(call LINK, $^)
> >  vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
> >  	$(call LINK, $^)
> > +rdmacm-mux$(EXESUF): $(rdmacm-mux-obj-y) $(COMMON_LDADDS)
> > +	$(call LINK, $^)
> >
> >  module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
> >  	$(call quiet-command,$(PYTHON) $< $@ \
> > diff --git a/Makefile.objs b/Makefile.objs
> > index 7a9828da28..8a7b6fc7b6 100644
> > --- a/Makefile.objs
> > +++ b/Makefile.objs
> > @@ -193,6 +193,7 @@ vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
> >  vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
> >  vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
> >  vhost-user-blk-obj-y = contrib/vhost-user-blk/
> > +rdmacm-mux-obj-y = contrib/rdmacm-mux/
> >
> >  ######################################################################
> >  trace-events-subdirs =
> > diff --git a/contrib/rdmacm-mux/Makefile.objs b/contrib/rdmacm-mux/Makefile.objs
> > new file mode 100644
> > index 0000000000..416288fc36
> > --- /dev/null
> > +++ b/contrib/rdmacm-mux/Makefile.objs
> > @@ -0,0 +1,3 @@
> > +CFLAGS += -libumad -Wno-format-truncation
> > +rdmacm-mux-obj-y = main.o
> > +
> > diff --git a/contrib/rdmacm-mux/main.c b/contrib/rdmacm-mux/main.c
> > new file mode 100644
> > index 0000000000..cba9f48b00
> > --- /dev/null
> > +++ b/contrib/rdmacm-mux/main.c
> > @@ -0,0 +1,680 @@
> > +#include "qemu/osdep.h"
> > +#include "sys/poll.h"
> > +#include "sys/ioctl.h"
> > +#include "pthread.h"
> > +#include "syslog.h"
> > +
> > +#include "infiniband/verbs.h"
> > +#include "infiniband/umad.h"
> > +#include "infiniband/umad_types.h"
> > +
> > +#define SCALE_US 1000
> > +#define COMMID_TTL 2 /* How many SCALE_US a context of MAD session is saved */
> > +#define SLEEP_SECS 5 /* This is used both in poll() and thread */
> > +#define SERVER_LISTEN_BACKLOG 10
> > +#define MAX_CLIENTS 4096
> > +#define MAD_BUF_SIZE 256
> > +#define MAD_MGMT_CLASS 0x7
>
> please take this from umad_types.h, see "UMAD_CLASS_CM".
>
> > +#define MAD_MGMT_VERSION 2
>
> please take this from umad_sa.h, see "UMAD_SA_CLASS_VERSION"
>
> > +#define MAD_RMPP_VERSION 0
>
> in "libibumad/umad_types.h" I see "UMAD_RMPP_VERSION" as 1... please
> check..

Thanks Shamir, I will skip the implementation comments for now, as this is an
RFC and not a PATCH, but will surely take them later.
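
To make the TX/RX routing from the design overview a bit more concrete, here
is a rough, self-contained sketch in C. It is not the code from main.c: the
table layout, the names (mux_tables, add_gid, on_tx, route_rx) and the
fixed-size arrays are made up for illustration. It also assumes that only
messages opening a new exchange (REQ, SIDR_REQ) are routed by destination GID
and everything else by comm id; the overview only says the choice depends on
attr_id. Expiry of comm id entries (COMMID_TTL) is not modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* CM attribute IDs that open a new exchange (ConnectRequest and
 * ServiceIDResolutionRequest). The enum names are local to this sketch. */
enum { ATTR_REQ = 0x0010, ATTR_SIDR_REQ = 0x0017 };

#define GID_LEN     16
#define MAX_ENTRIES 16   /* illustrative table size only */

struct gid_entry  { uint8_t gid[GID_LEN]; int fd; };
struct comm_entry { uint32_t comm_id; int fd; };

struct mux_tables {
    struct gid_entry  gids[MAX_ENTRIES];   /* gid -> fd */
    int n_gids;
    struct comm_entry comms[MAX_ENTRIES];  /* local_comm_id -> fd */
    int n_comms;
};

/* Client registration: remember which socket owns this GID. */
static void add_gid(struct mux_tables *t, const uint8_t *gid, int fd)
{
    assert(t->n_gids < MAX_ENTRIES);
    memcpy(t->gids[t->n_gids].gid, gid, GID_LEN);
    t->gids[t->n_gids].fd = fd;
    t->n_gids++;
}

/* TX path: before relaying to the UMAD layer, remember which client
 * started the session identified by local_comm_id. */
static void on_tx(struct mux_tables *t, uint32_t local_comm_id, int fd)
{
    assert(t->n_comms < MAX_ENTRIES);
    t->comms[t->n_comms].comm_id = local_comm_id;
    t->comms[t->n_comms].fd = fd;
    t->n_comms++;
}

/* RX path: pick the client fd for an incoming MAD, or -1 if unknown.
 * Assumption: new exchanges go by GID, the rest by comm id. */
static int route_rx(const struct mux_tables *t, uint16_t attr_id,
                    const uint8_t *dgid, uint32_t comm_id)
{
    int i;

    if (attr_id == ATTR_REQ || attr_id == ATTR_SIDR_REQ) {
        for (i = 0; i < t->n_gids; i++) {
            if (memcmp(t->gids[i].gid, dgid, GID_LEN) == 0) {
                return t->gids[i].fd;
            }
        }
        return -1;
    }

    for (i = 0; i < t->n_comms; i++) {
        if (t->comms[i].comm_id == comm_id) {
            return t->comms[i].fd;
        }
    }
    return -1;
}
```

In this sketch the server would call add_gid() when a client registers,
on_tx() for each message read from a client socket before umad_send(), and
route_rx() after umad_recv() to choose the socket to write to.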