RE: [PATCH 1/2] ibacm: Incorrect usage of BE byte order of MLID attach/detach_mcast()

> -----Original Message-----
> From: Hefty, Sean
> Sent: Friday, October 13, 2017 3:37 PM
> To: Ruhl, Michael J <michael.j.ruhl@xxxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx; hal@xxxxxxxxxxxxxxxxxx
> Subject: RE: [PATCH 1/2] ibacm: Incorrect usage of BE byte order of MLID
> attach/detach_mcast()
> 
> > The MLID value passed to ibv_attach/detach_mcast() must be in host
> > byte order.
> >
> > acmp.c incorrectly uses the big endian format when doing a multicast
> > attach/detach (join). Multicast packets are used to do name resolution
> > by the libibacmp library.
> >
> > There are two possible results because of this issue.
> >
> > If a kernel has commit 00b8a3351b2b, the attach will fail with an
> > EINVAL.  ibacm will log this as a failure during the multicast join.
> >
> > If a kernel does not have commit 00b8a3351b2b, the attach will
> > complete successfully.  Packets sent to this address will be dropped
> > because the packet dlid value and the multicast address information
> > given by the attach will not match.
> >
> > Update MLID usage to use the correct byte order.
> >
> > Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
> > Signed-off-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
> > ---
> >  ibacm/prov/acmp/src/acmp.c |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/ibacm/prov/acmp/src/acmp.c b/ibacm/prov/acmp/src/acmp.c
> > index aa78416..78d9a29 100644
> > --- a/ibacm/prov/acmp/src/acmp.c
> > +++ b/ibacm/prov/acmp/src/acmp.c
> > @@ -732,7 +732,7 @@ static void acmp_process_join_resp(struct acm_sa_mad *sa_mad)
> >  			acm_log(0, "ERROR - unable to create ah\n");
> >  			goto out;
> >  		}
> > -		ret = ibv_attach_mcast(ep->qp, &mc_rec->mgid, mc_rec->mlid);
> > +		ret = ibv_attach_mcast(ep->qp, &dest->mgid, dest->av.dlid);
> >  		if (ret) {
> >  			acm_log(0, "ERROR - unable to attach QP to multicast group\n");
> >  			ibv_destroy_ah(dest->ah);
> > @@ -1429,7 +1429,7 @@ static void acmp_ep_join(struct acmp_ep *ep)
> >
> >  	if (ep->mc_dest[0].state == ACMP_READY && ep->mc_dest[0].ah) {
> >  		ibv_detach_mcast(ep->qp, &ep->mc_dest[0].mgid,
> > -				 be16toh(ep->mc_dest[0].av.dlid));
> > +				 ep->mc_dest[0].av.dlid);
> >  		ibv_destroy_ah(ep->mc_dest[0].ah);
> >  		ep->mc_dest[0].ah = NULL;
> >  	}
> 
> Changes look correct for both patches.
> 
> Acked-by: Sean Hefty <sean.hefty@xxxxxxxxx>
> 
> It would be nice to understand how the code was working in the past.  At
> least ibacm has been able to report cached data.  All nodes would have
> joined the wrong mcast group, but as you mention the dlid in the AV
> wouldn't have matched.  I tried looking back through the ibacm history, but
> didn't see any relevant changes.

I don't believe the MLID value was checked previously.  Several months ago I added a patch to verify that the MLID is a multicast LID (in ib_attach_mcast), and that may be what caused this to stop working.
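
To illustrate why such a check would reject a byte-swapped MLID, here is a minimal sketch. It is not the actual kernel code: the helper names and macros are hypothetical, and the only assumption taken from the IB architecture is that multicast LIDs occupy 0xC000 through 0xFFFE, with 0xFFFF being the permissive LID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed IBA multicast LID range: 0xC000..0xFFFE (0xFFFF is permissive). */
#define MCAST_LID_BASE 0xC000u
#define PERMISSIVE_LID 0xFFFFu

/* Hypothetical helper: true if a host-byte-order LID is a multicast LID. */
static bool is_mcast_lid(uint16_t lid)
{
	return lid >= MCAST_LID_BASE && lid != PERMISSIVE_LID;
}

/*
 * Unconditional byte swap. This models what a big-endian MLID looks like
 * when a little-endian host interprets it as a host-order value.
 */
static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}
```

With a host-order MLID of 0xC001, is_mcast_lid() returns true.  The byte-swapped value, swap16(0xC001) == 0x01C0, falls in the unicast range, so a range check of this kind would fail it, which is consistent with the EINVAL described in the patch.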

M




