> -----Original Message-----
> From: Oren Duer [mailto:oren.duer@xxxxxxxxx]

General question, was there already an RFC for the kernel side that I missed or is it still pending?

> 1) Providing an RNIC the ability not to support non-offloaded operations on an offloaded QP.
> Once the QP will be modified to offloaded only a single SQ WQE with the connect response will
> Be executed and after that no additional SQ WQEs will be allowed from the consumer.

You didn't respond to the requirement of having the ability to not support non-offloaded operations. Was this missed? Do you see an issue with that?

> >> +++ b/libibverbs/man/ibv_map_nvmf_nsid.3
> >> +.B ibv_map_nvmf_nsid()
> >> +adds a new NVMe-oF namespace mapping to a given \fInvme_ctrl\fR.
> >> +The mapping is from the fabric facing frontend namespace ID
> >> +.I fe_nsid
> >> +to namespace
> >> +.I nvme_nsid
> >> +on this NVMe subsystem.
> >> +.I fe_nsid
> >> +must be unique within the SRQ that
> >> +.I nvme_ctrl
> >> +belongs to, all ibv_nvme_ctrl objects attached to the same SRQ share the same number space.
> >> +.PP
> >> +.I lba_data_size
> >> +defines the block size this namespace is formatted to, in bytes. Only specific block sizes are supported by the device.
> >> +Mapping several namespaces of the same NVMe subsystem will be done by calling this function several times with the same
> >> +.I nvme_ctrl
> >> +while assigning different
> >> +.I fe_nsid
> >> +and
> >> +.I nvme_nsid
> >> +with each call.
> >
> > Many NVMe controllers require multiple QPs in order to get high bandwidth. Having
> > a single NVMe QP (nvme_ctrl object) for each NVMe-oF namespace might be a bottleneck.
> > Would it be worth getting a list of nvme_ctrl to be used by the device when accessing this
> > namespace?
>
> You mean many NVMe controllers require multiple SQs in order to get high BW...
> nvme_ctrl should represent a single NVMe device.
> A single front-facing NSID will point to a single backend {nvme_ctrl,nsid}.
> If that is a requirement, we should probably add multiple SQs to a
> single nvme_ctrl, so that hardware can round-robin the requests
> between them.

Yes, that is what I had in mind.

> >> +.SH "RETURN VALUE"
> >> +.B ibv_map_nvmf_nsid()
> >> +and
> >> +.B ibv_unmap_nvmf_nsid
> >> +returns 0 on success, or the value of errno on failure (which indicates the failure reason).
> >> +.PP
> >> +failure reasons may be:
> >> +.IP EEXIST
> >> +Trying to map an already existing front-facing
> >> +.I fe_nsid
> >> +.IP ENOENT
> >> +Trying to delete a non existing front-facing
> >> +.I fe_nsid
> >> +.IP ENOTSUP
> >> +Given
> >> +.I lba_data_size
> >> +is not supported on this device. Check device release notes for supported sizes. Format the NVMe namespace to a LBA Format where the data + metadata size is supported by the device.
> >
> > Why should this not be a capability instead of checking in release notes of device?
>
> Yes, thought about making this a capability bitmask, but then it
> seemed to many options… besides the regular power-of-two block sizes
> there are combinations of metadata. So for instance a device could
> support 4096, 4160 (8x 512B blocks + 8B DIF each), 4104 (1x 4K block
> with 1 8B DIF), or any other metadata size…

What about two bitmaps, one for the block sizes supported (power of two) and one for whether the specific block size with metadata (8B DIF tag) is also supported?
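
To make the two-bitmap suggestion concrete, here is a rough sketch of how a check against such capabilities might look. Everything in it is an assumption for discussion only: struct nvmf_lba_caps, its two fields, NVMF_DIF_TAG_SIZE and nvmf_lba_size_supported() are hypothetical names and are not part of libibverbs or of the proposed patch. Bit n in each map stands for a 2^n byte data block.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability layout, not an existing libibverbs structure. */
struct nvmf_lba_caps {
	uint32_t block_size_bitmap;     /* bit n set: 2^n byte blocks supported */
	uint32_t block_size_dif_bitmap; /* bit n set: 2^n byte blocks + 8B DIF supported */
};

#define NVMF_DIF_TAG_SIZE 8u /* assumed single 8-byte DIF tag per block */

static bool is_pow2(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Index of the highest set bit; v must be non-zero. */
static unsigned int log2_u32(uint32_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/*
 * Returns true if lba_data_size (data + metadata, in bytes) is covered by
 * the two bitmaps. A power-of-two size is always treated as plain data
 * without metadata.
 */
static bool nvmf_lba_size_supported(const struct nvmf_lba_caps *caps,
				    uint32_t lba_data_size)
{
	if (is_pow2(lba_data_size))
		return (caps->block_size_bitmap &
			(1u << log2_u32(lba_data_size))) != 0;

	if (lba_data_size > NVMF_DIF_TAG_SIZE &&
	    is_pow2(lba_data_size - NVMF_DIF_TAG_SIZE))
		return (caps->block_size_dif_bitmap &
			(1u << log2_u32(lba_data_size - NVMF_DIF_TAG_SIZE))) != 0;

	return false;
}
```

With this layout 4096 and 4104 can be advertised, but the 4160 case (8x 512B blocks + 8B DIF each) cannot be expressed, so it would need either a third map or a different encoding.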
> >> diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
> >> index 0785c77..3d32cf4 100644
> >> --- a/libibverbs/verbs.h
> >> +++ b/libibverbs/verbs.h
> >> @@ -398,6 +420,8 @@ enum ibv_event_type {
> >>  	IBV_EVENT_CLIENT_REREGISTER,
> >>  	IBV_EVENT_GID_CHANGE,
> >>  	IBV_EVENT_WQ_FATAL,
> >> +	IBV_EVENT_NVME_PCI_ERR,
> >> +	IBV_EVENT_NVME_TIMEOUT
> >>  };
> >
> > We need a new event to give in case a non-supported operation
> > Arrives on an offloaded QP that doesn't support co-existence of
> > Non-offloaded operations. E.g. IBV_EVENT_UNSUPPORTED_NVMF_OP
> >
>
> You mean besides the QP going to error?

Yes, that is the way the application will be notified that the QP has moved to error state.