Re: [PATCH rdma-next 00/13] Elastic Fabric Adapter (EFA) driver

On 04-Dec-18 14:04, Gal Pressman wrote:
> Hello all,
> The following patchset introduces the Elastic Fabric Adapter (EFA) driver, that
> was pre-announced by Amazon [1].
> 
> EFA is a networking adapter designed to support user space network
> communication, initially offered in the Amazon EC2 environment. First release
> of EFA supports datagram send/receive operations and does not support
> connection-oriented or read/write operations.
> 
> EFA supports unreliable datagrams (UD) as well as a new unordered, scalable
> reliable datagram protocol (SRD). SRD provides support for reliable datagrams
> and more complete error handling than typical RD, but, unlike RD, it does not
> support ordering nor segmentation. A new queue pair type, IB_QPT_SRD, is added
> to expose this new queue pair type.
> User verbs are supported via a dedicated userspace libfabric provider.
> Kernel verbs and in-kernel services are initially not supported.
> 
> EFA enabled EC2 instances have two different devices allocated, one for ENA
> (netdev) and one for EFA, the two are separate pci devices with no in-kernel
> communication between them.
> 
> Thanks,
> Gal
> 
> [1] https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/
> 
> Gal Pressman (13):
>   RDMA: Add EFA related definitions
>   RDMA/efa: Add EFA device definitions
>   RDMA/efa: Add the PCI device id definitions
>   RDMA/efa: Add the efa.h header file
>   RDMA/efa: Add the efa_com.h file
>   RDMA/efa: Add the com service API definitions
>   RDMA/efa: Add the ABI definitions
>   RDMA/efa: Implement functions that submit and complete admin commands
>   RDMA/efa: Add com command handlers
>   RDMA/efa: Add bitmap allocation service
>   RDMA/efa: Add EFA verbs implementation
>   RDMA/efa: Add the efa module
>   RDMA/efa: Add driver to Kconfig/Makefile
> 
>  MAINTAINERS                                     |    8 +
>  drivers/infiniband/Kconfig                      |    2 +
>  drivers/infiniband/core/verbs.c                 |    2 +
>  drivers/infiniband/hw/Makefile                  |    1 +
>  drivers/infiniband/hw/efa/Kconfig               |   14 +
>  drivers/infiniband/hw/efa/Makefile              |    8 +
>  drivers/infiniband/hw/efa/efa.h                 |  191 +++
>  drivers/infiniband/hw/efa/efa_admin_cmds_defs.h |  783 ++++++++++
>  drivers/infiniband/hw/efa/efa_admin_defs.h      |  135 ++
>  drivers/infiniband/hw/efa/efa_bitmap.c          |   76 +
>  drivers/infiniband/hw/efa/efa_com.c             | 1122 ++++++++++++++
>  drivers/infiniband/hw/efa/efa_com.h             |  139 ++
>  drivers/infiniband/hw/efa/efa_com_cmd.c         |  544 +++++++
>  drivers/infiniband/hw/efa/efa_com_cmd.h         |  217 +++
>  drivers/infiniband/hw/efa/efa_common_defs.h     |   17 +
>  drivers/infiniband/hw/efa/efa_main.c            |  669 +++++++++
>  drivers/infiniband/hw/efa/efa_pci_id_tbl.h      |   25 +
>  drivers/infiniband/hw/efa/efa_regs_defs.h       |  117 ++
>  drivers/infiniband/hw/efa/efa_verbs.c           | 1827 +++++++++++++++++++++++
>  include/rdma/ib_verbs.h                         |    9 +-
>  include/uapi/rdma/efa-abi.h                     |   89 ++
>  21 files changed, 5993 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/infiniband/hw/efa/Kconfig
>  create mode 100644 drivers/infiniband/hw/efa/Makefile
>  create mode 100644 drivers/infiniband/hw/efa/efa.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_admin_cmds_defs.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_admin_defs.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_bitmap.c
>  create mode 100644 drivers/infiniband/hw/efa/efa_com.c
>  create mode 100644 drivers/infiniband/hw/efa/efa_com.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_com_cmd.c
>  create mode 100644 drivers/infiniband/hw/efa/efa_com_cmd.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_common_defs.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_main.c
>  create mode 100644 drivers/infiniband/hw/efa/efa_pci_id_tbl.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_regs_defs.h
>  create mode 100644 drivers/infiniband/hw/efa/efa_verbs.c
>  create mode 100644 include/uapi/rdma/efa-abi.h
> 

Hi Jason,

It looks like the discussion didn't come to a conclusion. I'm trying to come up
with a plan going forward and would like to get your opinion.

I followed the comments and your concerns, and I'll try to address them all.
Let me start by making clear that EFA is not an InfiniBand device, nor does it
aspire to be one, but I think it does fit the verbs model.

All technical comments will be fixed.

We can implement an rdma-core (libibverbs) userspace provider with support for
standard UD (including the 40-byte offset) and SRD QPs through direct verbs.
I'll also add documentation for the SRD QP type, even if we end up using it as
a driver QP type.

The create/destroy AH issue will be solved with the sleepable flag: EFA can
return -EOPNOTSUPP when called in an atomic context. When we add kernel verbs,
we can solve that the same way the bnxt driver did (polling for completion).
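
As a rough illustration of the sleepable-flag idea, here is a minimal
user-space sketch. It is not the driver code: DEMO_AH_SLEEPABLE, struct demo_ah
and the demo_* functions are hypothetical stand-ins, and the blocking admin
command is simulated.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: the caller promises it is allowed to sleep. */
#define DEMO_AH_SLEEPABLE (1u << 0)

/* Hypothetical address handle; only tracks whether creation happened. */
struct demo_ah {
	bool created;
};

/* Stand-in for an admin command that blocks until the device completes it. */
static int demo_submit_admin_cmd_blocking(struct demo_ah *ah)
{
	ah->created = true;
	return 0;
}

/*
 * Refuse to create the AH unless the caller passed the sleepable flag,
 * because the underlying admin command may sleep.
 */
int demo_create_ah(struct demo_ah *ah, unsigned int flags)
{
	if (!(flags & DEMO_AH_SLEEPABLE))
		return -EOPNOTSUPP;

	return demo_submit_admin_cmd_blocking(ah);
}
```

A caller in atomic context (no flag) gets -EOPNOTSUPP and the handle is left
untouched. A caller that passes the flag goes through the blocking path.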

The EFA wire protocol is tightly coupled to the wire protocol for EC2’s VPC
software defined network, which Amazon considers one of its proprietary
differentiating features.
We can’t share many of the details of the wire protocol as part of open sourcing
the kernel driver, but are happy to share details on any customer-visible
features, such as guarantees around our SRD protocol.
Since EFA is not designed to be used independently of EC2’s VPC data plane, we
don’t believe the lack of a well-documented wire protocol impacts customers in
any meaningful way.

Kernel verbs are not supported right now, but we do have future plans to support
them.
I know future plans are probably not something you care for, and I can't give
a time frame right now, but it's not overlooked.
We are driven by our customers, and they have shown interest in this.

We are focused on our customer base, and given our product offering we haven't
seen customer demand for NVMe-oF support, which you have set as the bar for the
RDMA subsystem.
I'd really like to avoid implementing things that do not interest our customers
and will not have actual use.

I genuinely believe that EFA belongs in the RDMA subsystem, far more than in
vfio or anywhere else.
We enforce PDs and MRs in the device, use standard AH registration and remote
QP number addressing, construct packet headers on the device, etc.
We are not simply hacking our way to userspace through the subsystem.

We can implement the driver in a different subsystem, but I truly believe that
no one will benefit from that.

Thanks,
Gal


