On Mon, Aug 19, 2019 at 09:40:24AM -0400, Hal Rosenstock wrote: > On 8/19/2019 8:21 AM, Jason Gunthorpe wrote: > > On Wed, Aug 14, 2019 at 08:15:07AM -0700, Bart Van Assche wrote: > >> The ib_srpt driver derives its default service GUID from the node GUID > >> of the first encountered HCA. Since that service GUID is passed to > >> ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can > >> be set in the node GUID of RoCE HCAs, filter these bits out. This > >> patch avoids that loading the ib_srpt driver fails as follows for the > >> hns driver: What is the actual problem anyhow? Is some roce GUID using the local bits and overlapping with the IB_CM_ASSIGN_SERVICE_ID? Ie generated VF MAC or something? > >> ib_srpt srpt_add_one(hns_0) failed. > >> > >> Cc: oulijun <oulijun@xxxxxxxxxx> > >> Reported-by: oulijun <oulijun@xxxxxxxxxx> > >> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx> > >> drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c > >> index e25c70a56be6..114bf8d6c82b 100644 > >> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c > >> @@ -3109,7 +3109,8 @@ static void srpt_add_one(struct ib_device *device) > >> srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq); > >> > >> if (!srpt_service_guid) > >> - srpt_service_guid = be64_to_cpu(device->node_guid); > >> + srpt_service_guid = be64_to_cpu(device->node_guid) & > >> + ~IB_SERVICE_ID_AGN_MASK; > > > > This seems kind of sketchy, masking bits in the GUID is going to make > > it non-unique.. Should we do this only for roce or something? > > > > Hal, do you have any insight? > > include/rdma/ib_cm.h:#define IB_SERVICE_ID_AGN_MASK > cpu_to_be64(0xFF00000000000000ULL) > > IB_SERVICE_ID_AGN_MASK masks entire first byte of OUI which seems like > too much to me as it contains company related bits. > > Would it work just masking the first 2 bits (local/global and X bits) ? Maybe if we see a local GUID it should generate a random global GUID instead. Jason