Re: [PATCH rdma-core 2/4] glue/redhat: add udev/systemd/etc infrastructure bits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/14/2016 7:19 PM, Jason Gunthorpe wrote:
> On Fri, Oct 14, 2016 at 03:21:34PM -0400, Jarod Wilson wrote:

>> diff --git a/glue/redhat/ibacm.service b/glue/redhat/ibacm.service
>> new file mode 100644
>> index 0000000..1cd031a
>> +++ b/glue/redhat/ibacm.service
> 
> Can we just put this in ibacm/ ?
> 
>> +Requires=rdma.service
>> +After=rdma.service opensm.service
> 
> This is the only RH specific thiing I see.. Could we standardize on
> something here and use it on all distros? rdma-available.target?

You can't, unless you rename the rdma.service unit file to something
else.  They are tied in that way.

>> +++ b/glue/redhat/rdma.cxgb3.sys.modprobe
>> @@ -0,0 +1 @@
>> +install cxgb3 /sbin/modprobe --ignore-install cxgb3 $CMDLINE_OPTS && /sbin/modprobe iw_cxgb3
>> diff --git a/glue/redhat/rdma.cxgb4.sys.modprobe b/glue/redhat/rdma.cxgb4.sys.modprobe
>> new file mode 100644
>> index 0000000..44163ab
>> +++ b/glue/redhat/rdma.cxgb4.sys.modprobe
>> @@ -0,0 +1 @@
>> +install cxgb4 /sbin/modprobe --ignore-install cxgb4 $CMDLINE_OPTS && /sbin/modprobe iw_cxgb4
> 
> What are these for? Should they be cross distro? Why are only a few
> drivers this special?

We have one of these for every two (or more) part driver.  They aren't
special, it's just the multipart drivers that are.

>> +++ b/glue/redhat/rdma.fixup-mtrr.awk
>> @@ -0,0 +1,160 @@
>> +# This is a simple script that checks the contents of /proc/mtrr to see if
>> +# the BIOS maker for the computer took the easy way out in terms of
>> +# specifying memory regions when there is a hole below 4GB for PCI access
>> +# and the machine has 4GB or more of RAM.  When the contents of /proc/mtrr
>> +# show a 4GB mapping of write-back cached RAM, minus punch out hole(s) of
>> +# uncacheable regions (the area reserved for PCI access), then it becomes
>> +# impossible for the ib_ipath driver to set write_combining on its PIO
>> +# buffers.  To correct the problem, remap the lower memory region in various
>> +# chunks up to the start of the punch out hole(s), then delete the punch out
>> +# hole(s) entirely as they aren't needed any more.  That way, ib_ipath will
>> +# be able to set write_combining on its PIO memory access region.
> 
> Yuk, a thousand times yuk.
> 
> I thought the mtrr cleanup built into modern kernel took care of this?
> 
> Really though, this needs to be fixed upstream in the kernel :|

It is fixed in recent kernels.  This is a holdover from rhel5 days where
this code made the difference between a 50MBit/s and 850MBit/s qib adapter.

>> diff --git a/glue/redhat/rdma.kernel-init b/glue/redhat/rdma.kernel-init
>> new file mode 100644
>> index 0000000..6cb4732
>> +++ b/glue/redhat/rdma.kernel-init
> 
> I wonder if this could be split into a generic 'load the modules' part
> and a distro specific part? Every distro needs systemd to load the
> extra modules because out auto-loading is broken - IMHO, and that is
> pretty complex unfortunately.

Yes, this probably could be broken out.

>> +errata_58()
>> +{
>> +    # Check AMD chipset issue Errata #58
>> +    if test -x /sbin/lspci && test -x /sbin/setpci; then
>> +	if ( /sbin/lspci -nd 1022:1100 | grep "1100" > /dev/null ) &&
>> +	   ( /sbin/lspci -nd 1022:7450 | grep "7450" > /dev/null ) &&
>> +	   ( /sbin/lspci -nd 15b3:5a46 | grep "5a46" > /dev/null ); then
>> +	    CURVAL=`/sbin/setpci -d 1022:1100 69`
> 
> Another yuk. Why isn't this handled upstream in drivers/pci/quirks.c
> with the rest of the 8131 errata?
> 
> Fortunately I expect all 8131 hardware is long since gone, that chip
> was end-of-manufacturing'd very quickly 2003ish IIRC.

See the above about the mtrr registers.  Same thing here.  Holdover from
long ago.

>> +++ b/glue/redhat/rdma.service
>> @@ -0,0 +1,15 @@
>> +[Unit]
>> +Description=Initialize the iWARP/InfiniBand/RDMA stack in the kernel
>> +Documentation=file:/etc/rdma/rdma.conf
>> +RefuseManualStop=true
>> +DefaultDependencies=false
>> +Conflicts=emergency.target emergency.service
>> +Before=network.target remote-fs-pre.target
> 
> This is an area we really need to cross-distro standardize - we really
> need a set of rdma-*.targets.
> 
> eg
>  rdma-available.target
>    - RDMA hardware is available and all prep is done
>      opensm (if installed) is started, etc
>      Use in place of rdma.service
>   rdma-detected.target
>    - udev detected rdma hardware

It's not that easy, unfortunately.  Creating a target is a big deal.  A
service is not.  And targets don't work the way someone would expect in
systemd.  Putting a Before= tag in a systemd unit file doesn't mean what
I would expect (I think I'm probably fairly common in this regard, but I
could be wrong).  I would have thought it means "Start this unit before
starting the target listed in the Before= line", instead it means "Start
this unit and make sure it finishes before the target in the Before=
line is considered complete".  It can be started after the listed target
is started, but the listed target won't be considered complete until it
is also complete.  This caused me lots of heartache when I was creating
these files :-/

Fortunately, the targets listed in the unit files are pretty standard
(they are part of the systemd upstream), and so I think they can be
cross distro just as they are.

>> +# When we detect a new verbs device is added to the system, set the node
>> +# description on that device
>> +# If rdma-ndd is installed, defer the setting of the node description to it.
>> +SUBSYSTEM=="infiniband", KERNEL=="*", ACTION=="add", TEST!="/usr/sbin/rdma-ndd", RUN+="/bin/bash -c 'sleep 1; echo -n `hostname -s` %k > /sys/class/infiniband/%k/node_desc'"
> 
> Shouldn't this udev drop-in by in the rdma-ndd package?
> 
>> +++ b/glue/redhat/srp_daemon.service
>> @@ -0,0 +1,17 @@
>> +[Unit]
>> +Description=Start or stop the daemon that attaches to SRP devices
>> +Documentation=file:///etc/rdma/rdma.conf file:///etc/srp_daemon.conf
>> +DefaultDependencies=false
>> +Conflicts=emergency.target emergency.service
>> +Requires=rdma.service
>> +Wants=opensm.service
>> +After=rdma.service opensm.service
>> +After=network.target
>> +Before=remote-fs-pre.target
> 
> Also should be common, why does it reference opensm.service?

Because if opensm is running on this host, then it must be up before the
configured srp targets are valid any time there is a non-default subnet
prefix.

>> +
>> +[Service]
>> +Type=simple
>> +ExecStart=/usr/sbin/srp_daemon.sh
> 
> Hurm, someday we have to make better systemd integration for these
> daemons..

There really isn't any better integration to get with our complex
daemons unless we update the daemons themselves to get rid of their
shell script starters...


-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG Key ID: 0E572FDD

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux