Re: [RFC PATCH] contrib/rdmacm-mux: Add implementation of RDMA User MAD multiplexer

On Mon, Jul 23, 2018 at 03:58:15PM +0300, Shamir Rabinovitch wrote:
> On Mon, Jul 16, 2018 at 11:03:53PM +0300, Yuval Shaia wrote:
> > RDMA MAD kernel module (ibcm) disallow more than one MAD-agent for a
> > given MAD class.
> > This does not go hand-by-hand with qemu pvrdma device's requirements
> > where each VM is MAD agent.
> > Fix it by adding implementation of RDMA MAD multiplexer service which on
> > one hand register as a sole MAD agent with the kernel module and on the
> > other hand gives service to more than one VM.

Hi Shamir,

> 
> worth to mention on what git & branch you apply this patch. it's qemu
> git probably. but it's really need to be in rdma-core git.

This is up to the rdma-core maintainers. I brought it here for review as an
RFC to get feedback on the idea. If the idea is accepted and this reaches the
PATCH stage, I will port it to the rdma-core tree.

> 
> > 
> > Design Overview:
> > ----------------
> > A server process is registered to UMAD framework (for this to work the
> > rdma_cm kernel module needs to be unloaded) and creates a unix socket to
> 
> is it possible to implement same multiplexer in kernel and allow kernel
> user MAD support for multiple users at once? looks like better approach
> then the umad one..

I'm under the assumption that others would not like the idea of patching
the kernel for this purpose, but I might be wrong.

> 
> > listen to incoming request from clients.
> > A client process (such as QEMU) connects to this unix socket and
> > registers with its own GID.
> > 
> > TX:
> > ---
> > When client needs to send rdma_cm MAD message it construct it the same
> > way as without this multiplexer, i.e. creates a umad packet but this
> > time it writes its content to the socket instead of calling umad_send().
> > The server, upon receiving such a message fetch local_comm_id from it so
> > a context for this session can be maintain and relay the message to UMAD
> > layer by calling umad_send().
> > 
> > RX:
> > ---
> > The server creates a worker thread to process incoming rdma_cm MAD
> > messages. When an incoming message arrived (umad_recv()) the server,
> > depending on the message type (attr_id) looks for target client by
> > either searching in gid->fd table or in local_comm_id->fd table. With
> > the extracted fd the server relays to incoming message to the client.
> > 
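To make the RX lookup described above concrete, here is a minimal sketch of
the two-table dispatch. It is an illustration only and not the code in
main.c: the names (route_tables, add_gid, add_comm, lookup_fd, msg_kind) are
hypothetical, the tables are plain arrays, and it assumes that a message
opening a new connection is routed by GID while a message belonging to an
existing session is routed by comm id. The mapping from attr_id to these two
kinds is left out on purpose.

```c
#include <stdint.h>
#include <string.h>

#define MAX_ROUTES 16
#define GID_LEN    16

/* Hypothetical message kinds; a real server would derive them from attr_id. */
enum msg_kind { MSG_NEW_CONNECTION, MSG_EXISTING_SESSION };

struct gid_route  { uint8_t gid[GID_LEN]; int fd; };
struct comm_route { uint32_t comm_id;     int fd; };

struct route_tables {
    struct gid_route  gids[MAX_ROUTES];
    int               n_gids;
    struct comm_route comms[MAX_ROUTES];
    int               n_comms;
};

/* Record that the client on socket fd registered with this GID. */
int add_gid(struct route_tables *t, const uint8_t *gid, int fd)
{
    if (t->n_gids >= MAX_ROUTES) {
        return -1;
    }
    memcpy(t->gids[t->n_gids].gid, gid, GID_LEN);
    t->gids[t->n_gids].fd = fd;
    t->n_gids++;
    return 0;
}

/* Record that a session with this comm id belongs to the client on fd. */
int add_comm(struct route_tables *t, uint32_t comm_id, int fd)
{
    if (t->n_comms >= MAX_ROUTES) {
        return -1;
    }
    t->comms[t->n_comms].comm_id = comm_id;
    t->comms[t->n_comms].fd = fd;
    t->n_comms++;
    return 0;
}

/* Pick the target client fd for an incoming message, or -1 if none matches. */
int lookup_fd(const struct route_tables *t, enum msg_kind kind,
              const uint8_t *gid, uint32_t comm_id)
{
    int i;

    if (kind == MSG_NEW_CONNECTION) {
        /* No session context exists yet, so search the gid->fd table. */
        for (i = 0; i < t->n_gids; i++) {
            if (memcmp(t->gids[i].gid, gid, GID_LEN) == 0) {
                return t->gids[i].fd;
            }
        }
        return -1;
    }

    /* Otherwise search the comm_id->fd table saved on the TX path. */
    for (i = 0; i < t->n_comms; i++) {
        if (t->comms[i].comm_id == comm_id) {
            return t->comms[i].fd;
        }
    }
    return -1;
}
```

In this sketch the RX worker would call lookup_fd() after each received MAD
and write the message to the returned fd; a return of -1 means no registered
client matches and the message would be dropped.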
> > Signed-off-by: Yuval Shaia <yuval.shaia@xxxxxxxxxx>
> > ---
> >  Makefile                         |   3 +
> >  Makefile.objs                    |   1 +
> >  contrib/rdmacm-mux/Makefile.objs |   3 +
> >  contrib/rdmacm-mux/main.c        | 680 +++++++++++++++++++++++++++++++
> >  4 files changed, 687 insertions(+)
> >  create mode 100644 contrib/rdmacm-mux/Makefile.objs
> >  create mode 100644 contrib/rdmacm-mux/main.c
> > 
> > diff --git a/Makefile b/Makefile
> > index 2da686be33..9ef307ba6e 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -416,6 +416,7 @@ dummy := $(call unnest-vars,, \
> >                  qga-obj-y \
> >                  ivshmem-client-obj-y \
> >                  ivshmem-server-obj-y \
> > +                rdmacm-mux-obj-y \
> >                  libvhost-user-obj-y \
> >                  vhost-user-scsi-obj-y \
> >                  vhost-user-blk-obj-y \
> > @@ -717,6 +718,8 @@ vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
> >  	$(call LINK, $^)
> >  vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
> >  	$(call LINK, $^)
> > +rdmacm-mux$(EXESUF): $(rdmacm-mux-obj-y) $(COMMON_LDADDS)
> > +	$(call LINK, $^)
> >  
> >  module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
> >  	$(call quiet-command,$(PYTHON) $< $@ \
> > diff --git a/Makefile.objs b/Makefile.objs
> > index 7a9828da28..8a7b6fc7b6 100644
> > --- a/Makefile.objs
> > +++ b/Makefile.objs
> > @@ -193,6 +193,7 @@ vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
> >  vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
> >  vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
> >  vhost-user-blk-obj-y = contrib/vhost-user-blk/
> > +rdmacm-mux-obj-y = contrib/rdmacm-mux/
> >  
> >  ######################################################################
> >  trace-events-subdirs =
> > diff --git a/contrib/rdmacm-mux/Makefile.objs b/contrib/rdmacm-mux/Makefile.objs
> > new file mode 100644
> > index 0000000000..416288fc36
> > --- /dev/null
> > +++ b/contrib/rdmacm-mux/Makefile.objs
> > @@ -0,0 +1,3 @@
> > +CFLAGS += -libumad -Wno-format-truncation
> > +rdmacm-mux-obj-y = main.o
> > +
> > diff --git a/contrib/rdmacm-mux/main.c b/contrib/rdmacm-mux/main.c
> > new file mode 100644
> > index 0000000000..cba9f48b00
> > --- /dev/null
> > +++ b/contrib/rdmacm-mux/main.c
> > @@ -0,0 +1,680 @@
> > +#include "qemu/osdep.h"
> > +#include "sys/poll.h"
> > +#include "sys/ioctl.h"
> > +#include "pthread.h"
> > +#include "syslog.h"
> > +
> > +#include "infiniband/verbs.h"
> > +#include "infiniband/umad.h"
> > +#include "infiniband/umad_types.h"
> > +
> > +#define SCALE_US 1000
> > +#define COMMID_TTL 2 /* How many SCALE_US a context of MAD session is saved */
> > +#define SLEEP_SECS 5 /* This is used both in poll() and thread */
> > +#define SERVER_LISTEN_BACKLOG 10
> > +#define MAX_CLIENTS 4096
> > +#define MAD_BUF_SIZE 256
> > +#define MAD_MGMT_CLASS 0x7
> 
> please take this from umad_types.h, see "UMAD_CLASS_CM".
> 
> > +#define MAD_MGMT_VERSION 2
> 
> please take this from umad_sa.h, see "UMAD_SA_CLASS_VERSION"
> 
> > +#define MAD_RMPP_VERSION 0
> 
> in "libibumad/umad_types.h" I see "UMAD_RMPP_VERSION" as 1... please
> check..

Thanks Shamir. I will skip the implementation comments for now, as this is an
RFC and not a PATCH, but I will certainly address them later.

> 