On Tue, Oct 09, 2018 at 04:43:06PM +0200, Christian Ehrhardt wrote:
> The default modules config that is processed is kernel-boot/modules/rdma.conf
> which does not contain ib_umad (infiniband.conf and opa.conf would). But
> no matter what the default configs are - they could be modified by an admin,
> due to that today there are cases the service would start ibacm with the module
> not loaded.
>
> That will trigger the service to immediately fail with:
> ibacm[1796]: ibwarn: [1796] umad_init: can't read ABI version from
> /sys/class/infiniband_mad/abi_version (No such file or directory): is
> ib_umad module loaded?
> systemd[1]: ibacm.service: Main process exited, code=exited, status=255/n/a

Why is ibacm.service even starting? It is supposed to socket activate
and cases where we don't have umad shouldn't send any packets to the
socket in the first place.

So the bug here is that ibacm.service is either starting without a
request on the socket, or because there is a request on the socket for
roce/iwarp/etc ports.

> diff --git a/ibacm/ibacm.service.in b/ibacm/ibacm.service.in
> index 23d45250..a8516af5 100644
> +++ b/ibacm/ibacm.service.in
> @@ -10,6 +10,8 @@ Wants=rdma-load-modules@rdma.service
>  After=rdma-load-modules@rdma.service
>  # Order ibacm startup after basic RDMA hw setup.
>  After=rdma-hw.target
> +# Do not start the service if ib_umad is not loaded or not working
> +ConditionPathExists=/sys/class/infiniband_mad/abi_version

This is not a good solution, it will not support later hotplug of
devices that need ibacm.

Jason