On 26-Nov-18 09:48, Leon Romanovsky wrote:
> From: Parav Pandit <parav@xxxxxxxxxxxx>
>
> Describe ib_core_device, ib_device association and their existence
> in net namespaces for backward compatibility.
>
> Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  Documentation/infiniband/core_devices.txt | 101 ++++++++++++++++++++++
>  1 file changed, 101 insertions(+)
>  create mode 100644 Documentation/infiniband/core_devices.txt
>
> diff --git a/Documentation/infiniband/core_devices.txt b/Documentation/infiniband/core_devices.txt
> new file mode 100644
> index 000000000000..ff28def49ff4
> --- /dev/null
> +++ b/Documentation/infiniband/core_devices.txt
> @@ -0,0 +1,101 @@
> +Linux RDMA devices and their sysfs entries
> +------------------------------------------
> +
> +1. Background
> +--------------
> +RDMA networking devices have at least 3 link or transport layers.
> +(a) InfiniBand
> +(b) RoCE
> +(c) iWarp
> +
> +These networking devices provide kernel bypass for sending/receiving
> +data to/from the network.
> +
> +There are various modes in which these devices are used along with
> +other protocols for connection establishment and/or for data transfer.
> +Such as,
> +(a) rdmacm for connection establishement and verbs for data transfer.

"establishement" -> "establishment"

> +(b) tcp/ip for connection establishment and verbs for data transfer.
> +
> +Additionally rdma devices can be shared among multiple net namespaces.
> +
> +It is also desired to have per net namespace rdma devices as the
> +stack matures.
> +
> +sysfs entries are heavily used for device discovery, statistics and network
> +addresses in rdma stack.
> +
> +Therefore, to have minimal impact on backward compatibility for these 3
> +transports and to provide forward looking method, the following sysfs
> +isolation approach is taken.
> +
> +2. Design
> +----------
> +
> +For every rdma ib_device, core code creates an ib_core_device in every
> +net namespace to give the appearance that the rdma device is present
> +in all net namespaces.
> +Each ib_core_device owns the sysfs entries in their net namespace.
> +
> +All ib_core_device(s) points to one owner ib_device using driver_data.
> +
> +2.1 Shared rdma ib_device view in different net namespaces
> +-----------------------------------------------------------
> +
> + ib_core_device (net_ns_1)
> + +--------------+
> + |              |
> + |    device    |
> + | +----------+ |
> + | |          | |
> + | |          | |
> + | |          | |
> + | +----------+ |                         (init_net)
> + |              |                         ib_device
> + | *owner-------------------------+------>+--------------------+<--+
> + +--------------+                 |       |                    |   |
> +                                  |       |  ib_core_device    |   |
> +                                  |       |  +--------------+  |   |
> +                                  |       |  |              |  |   |
> +                                  |       |  |    device    |  |   |
> +                                  |       |  | +----------+ |  |   |
> + ib_core_device (net_ns_2)        |       |  | |          | |  |   |
> + +--------------+                 |       |  | |          | |  |   |
> + |              |                 |       |  | |          | |  |   |
> + |    device    |                 |       |  | +----------+ |  |   |
> + | +----------+ |                 |       |  |              |  |   |
> + | |          | |                 |       |  | *owner--------------+
> + | |          | |                 |       |  +--------------+  |
> + | |          | |                 |       +--------------------+
> + | +----------+ |                 |
> + |              |                 |
> + |  *owner------------------------+
> + +--------------+
> +
> +2.2 rdma ib_device bound to a net namespace (in future)
> +--------------------------------------------------------
> +
> +In this mode, when an rdma device is bound to a net namespace, all compat
> +sysfs entries will be terminated. sysfs entries will reside in single
> +net namespace which device is bound to.
> +Thereby having one-to-one mapping and providing isolation of devices
> +to their owning net namespace.
> +
> +(net_ns_1)
> +ib_device
> ++--------------------+
> +|                    |
> +|                    |
> +|  ib_core_device    |
> +|  +--------------+  |
> +|  |              |  |
> +|  |    device    |  |
> +|  | +----------+ |  |
> +|  | |          | |  |
> +|  | |          | |  |
> +|  | |          | |  |
> +|  | +----------+ |  |
> +|  |              |  |
> +|  |   *owner     |  |
> +|  +--------------+  |
> ++--------------------+
>
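
To make the owner relationship from section 2 concrete, here is a minimal user-space sketch of how I read it. It is not the code from this series: the struct layouts, the net_ns_id field and the sketch_* helpers are all assumptions made for illustration, and only the names ib_device, ib_core_device and driver_data come from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct device; only driver_data is modelled. */
struct device {
	void *driver_data;
};

/* Assumed layout: one per net namespace, owning the sysfs entries there. */
struct ib_core_device {
	struct device dev;
	int net_ns_id;	/* hypothetical namespace tag, not a real kernel field */
};

/* Assumed layout: the owner embeds the ib_core_device living in init_net. */
struct ib_device {
	struct ib_core_device coredev;
};

/* Hypothetical helper: point a core device at its owner via driver_data. */
static void sketch_bind_owner(struct ib_core_device *cdev,
			      struct ib_device *owner, int net_ns_id)
{
	assert(cdev != NULL && owner != NULL);
	cdev->dev.driver_data = owner;
	cdev->net_ns_id = net_ns_id;
}

/* Hypothetical helper: the embedded init_net entry points back at its owner. */
static void sketch_ib_device_init(struct ib_device *ibdev)
{
	sketch_bind_owner(&ibdev->coredev, ibdev, 0);	/* 0 stands for init_net */
}

/* Hypothetical helper: resolve any core device, compat or not, to its owner. */
static struct ib_device *sketch_owner(const struct ib_core_device *cdev)
{
	assert(cdev != NULL);
	return cdev->dev.driver_data;
}
```

In this reading, the compat entries for net_ns_1 and net_ns_2 in 2.1 are separate ib_core_device objects bound to the same ib_device, while the mode in 2.2 leaves only the embedded entry.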