Re: rdma-core in Dabian

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 12, 2017 at 06:24:35PM +0000, Bart Van Assche wrote:

> When using systemd templates instead of the current approach it would
> become cumbersome for users to stop and start the srp_daemon on all ports
> simultaneously.

I think systemd PartOf dependencies is intended to solve that
problem..

       PartOf=

           Configures dependencies similar to Requires=, but limited
           to stopping and restarting of units. When systemd stops or restarts
           the units listed here, the action is propagated to this
           unit. Note that this is a one-way dependency -- changes to this unit
           do not affect the listed units.

So there could still be a service name that covered all of the
children (eg srp_daemon.service and srp_daemon_port@.service)

> Additionally, if an RDMA adapter is hot-added, should the srp_daemon
> be started or should it not be started for the newly added ports?

I'm not sure what is best for that policy..

Today it looks like srp_daemon defaults to explicitly opt-in to
running on ports. To continue that policy the admin would have to
explicitly opt in to each port. Eg perhaps something like:

 systemctl enable srp_daemon@mlx4_0/0

Instead of editing a /etc/ config file.

systemd has a '.device' system, which as I understand it, would allow
srp_daemon@mlx4_0/0 to depend on mlx4_0/0.device, which is created by
udev on hotplug, and would trigger running the service. 

(perhaps for this reason the template value would be umad0, not mlx4_0/0)

> An in-between program could indeed do the job of listening for udev events
> and actively managing srp_daemon children. Since we already have an
> in-between program, namely srp_daemon.sh, the simplest approach is to modify
> that shell script. Can you have a look at the attached patch?

This seems like a reasonable improvement, but as I said, the inbetween
program is a bit more complex that you probably want to do with a
script:

- Sites do not like things running on timers due to the introduced
  jitter, so the trivial sleep loop is not nice.
- Failure of a child should result in a sensible log message and a
  restart protocol like systemd has. Failure to SIGTERM the child should
  escalate to SIGKILL to prevent lockups during shutdown
- Dumping stderr/etc into /dev/null is a bad idea (eg glibc internal
  asserts are lost now). In a systemd world each child should get its
  own stderrr logger connection so this stuff works right.
- We still need to be able to configure which ports srp_daemon runs
  on and to 'reload' that configuration into the script
- Ideally hot removal of an adaptor requires halting just those
  daemons without triggering restart

Since so much of that overlaps with systemd capabilities it does make
some sense to have systemd manage the child processes directly.

For instance you could have the srp_daemon script like in the patch
but replace this:

  ${prog} -e -c -n [..]
with
  systemctl start srp_daemon_port@${hca_id}/${port}

Using PartOf dependencies.. The user experience would be very similar
to the script directly forking, but many more of the above areas are
covered off..

But.. in that case we don't even need the script. If the admin wishes
to run srp_daemon on all IB ports then a simple udev rule will do the
job:

SUBSYSTEM=="infiniband", TAG+="systemd", ATTRS{node_desc}=="*", ENV{SYSTEMD_WANTS}+="srp_daemon_port@$name.service"

[more or less]

And now the timer is gone too.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux