Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver

On Sun, Sep 20, 2020 at 10:47:02AM +0200, Greg Kroah-Hartman wrote:
> > If not, what open source userspace are you going to ask them to
> > present to merge the kernel side into misc?
> 
> I don't think that they have a userspace api to their rdma feature from
> what I understand, but I could be totally wrong as I do not know their
> hardware at all, so I'll let them answer this question.

I thought Oded was pretty clear: the goal of this series is to expose
their RDMA HW to userspace. This problem space requires co-mingling
networking and compute at extremely high speed/low overhead. This is
all done in userspace.

We are specifically talking about this in
include/uapi/misc/habanalabs.h:

 /*
  * NIC
  *
  * This IOCTL allows the user to manage and configure the device's NIC ports.
  * The following operations are available:
  * - Create a completion queue
  * - Destroy a completion queue
  * - Wait on completion queue
  * - Poll a completion queue
  * - Update consumed completion queue entries
  * - Set a work queue
  * - Unset a work queue
  *
  * For all operations, the user should provide a pointer to an input structure
  * with the context parameters. Some of the operations also require a pointer to
  * driver regarding how many of the available CQEs were actually
  * processed/consumed. Only then the driver will override them with newer
  * entries.
  * The set WQ operation should provide the device virtual address of the WQ with
  * a matching size for the number of WQs and entries per WQ.
  *
  */
 #define HL_IOCTL_NIC	_IOWR('H', 0x07, struct hl_nic_args)

Which is ibv_create_qp, ibv_create_cq, ibv_poll_cq, etc.
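
As a side note on the define quoted above, here is a minimal sketch of how
an _IOWR() command number breaks down into its fields. The layout of
struct hl_nic_args is not quoted in this mail, so the struct below is a
hypothetical stand-in, and HL_IOCTL_NIC_SKETCH and decode_ioctl() are
illustrative names only, not part of the driver:

```c
#include <stdint.h>
#include <linux/ioctl.h>	/* _IOWR and the _IOC_* decode macros */

/*
 * HYPOTHETICAL stand-in for struct hl_nic_args. The real layout is not
 * shown in this mail; these fields are assumptions for illustration.
 */
struct hl_nic_args_sketch {
	uint64_t input_ptr;	/* user pointer to the input structure */
	uint64_t output_ptr;	/* user pointer to the output structure */
	uint32_t op;		/* which NIC operation is requested */
	uint32_t input_size;
	uint32_t output_size;
	uint32_t pad;
};

/* Same magic ('H') and number (0x07) as the quoted define. */
#define HL_IOCTL_NIC_SKETCH	_IOWR('H', 0x07, struct hl_nic_args_sketch)

struct ioctl_fields {
	unsigned int dir;	/* _IOC_READ and/or _IOC_WRITE */
	unsigned int type;	/* the "magic" character */
	unsigned int nr;	/* command number within that magic */
	unsigned int size;	/* sizeof() the argument structure */
};

/* Split an ioctl command number back into its four encoded fields. */
static inline struct ioctl_fields decode_ioctl(unsigned long cmd)
{
	struct ioctl_fields f = {
		.dir  = _IOC_DIR(cmd),
		.type = _IOC_TYPE(cmd),
		.nr   = _IOC_NR(cmd),
		.size = _IOC_SIZE(cmd),
	};

	return f;
}
```

Decoding HL_IOCTL_NIC_SKETCH this way gives back the 'H' magic, command
0x07, both direction bits, and the size of the stand-in struct. Every
operation in the comment is multiplexed behind that one command number.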

Habana has repeatedly described their HW as having multiple 100G RoCE
ports. RoCE is one of the common industry standards that ibverbs
unambiguously is responsible for.

I would be much less annoyed if they were not actively marketing their
product as RoCE RDMA.

Sure there is some argument that their RoCE isn't spec compliant, but
I don't think it excuses the basic principle of our subsystem:

 RDMA HW needs to demonstrate some basic functionality using the
 standard open source userspace software stack.

I don't like this idea of backdooring a bunch of proprietary closed
source RDMA userspace through drivers/misc, and if you don't have a
clear idea how to get something equal for drivers/misc you should not
accept the HL_IOCTL_NIC.

Plus RoCE is complicated, there is a bunch of interaction with netdev
and rules related to that which really need to be respected.

> For anything that _has_ to have a userspace RDMA interface, sure ibverbs
> are the one we are stuck with, but I didn't think that was the issue
> here at all, which is why I wrote the above comments.

I think you should look at patches #8 through #11:

https://lore.kernel.org/lkml/20200915171022.10561-9-oded.gabbay@xxxxxxxxx/

Jason


