Re: [PATCH 2/2] Replace linuxrc.s390 network configuration code with NetworkManager.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/22/2010 11:38 PM, David Cantrell wrote:
> On Fri, 22 Jan 2010, Steffen Maier wrote:
>> On 01/22/2010 12:21 AM, David Cantrell wrote:
>>> Remove linuxrc.s390's usage of /sbin/ip, /sbin/ifconfig, and /sbin/route
>>> and instead just write the settings to the ifcfg-DEVICE file, set the
>>> device online in sysfs, and start NetworkManager.  This prevents a bug

> The main goal of this change is to reduce the duplication of effort we
> currently have in linuxrc.s390.  We use NetworkManager on the other
> platforms
> to bring up interfaces, so we should use it on s390x too.  If it doesn't
> work
> on s390x, we should fix NetworkManager rather than work around it in
> linuxrc.s390.

Meanwhile I realized which specific change I really disliked and how to
fix it while at the same time accepting most of the removals. Thanks for
having been patient. Please see close to the bottom for my idea.

Independent of that, if those changes are also planned to go into the
rhel6-branch, then I fear we'll face serious regressions. Therefore, I
still don't think this is a good fix for 557702 unless you successfully
perform a thorough function test. To give you an impression of what that
means, here is a rough sketch of what I did before submitting the
linuxrc rewrite (and yes many of those test cases revealed flaws):

real hsi, layer2, nfs
real hsi, layer3, http
ctc, nfs
lcs, ftp
real osa, layer2, ftp
real osa, layer3, ftp
guestlan osa, layer2, nfs
guestlan osa, layer3, http
guestlan hsi, ftp
real osa, layer2, ipv6, ftp [could fetch stage2 but not further]
real osa, layer3, ipv6, ftp [could fetch stage2 but not further]

Each of those test cases was tested manually with interactive dialogs
(intentionally entering wrong values and correcting parameters or
restarting the whole dialog)
and once more fully unattended with parm&conf file & kickstart.

>>>  loader/linuxrc.s390 |  313
>>> ++++-----------------------------------------------
>>>  1 files changed, 22 insertions(+), 291 deletions(-)

>>> -# sets device up and blocks until device appears to be up
>>> -function set_device_up() {

>> By removing this, you rely on enough time to have passed until NM gets
>> started which is fragile. Reminds me of device_cio_free or dasd_settle
>> et al.
> 
> NM continually polls the interfaces, so I don't see the point of this
> function.

OK, fair enough.

>>> -    [ -z "$mmtu_was_set" ] && unset MMTU
>>
>> This is still required for show_parm() to work. I never fully understood
>> why this magic is there to distinguish between user input and parm/conf
>> file input. However, I wouldn't want to break any old feature that was
>> in linuxrc.s390 before I touched it.
> 
> The MMTU variable gets set to "mtu $MTU" and used as an argument to
> /sbin/ifconfig and /sbin/ip.  NetworkManager handles this fine when it
> sees an
> MTU variable in the device ifcfg file.

The variable above is not related to configuring anything but to control
the dialog workflow. Please see below for some more details.

> It looks like NM does not do anything with the MACADDR variable
> currently.  We can patch ifcfg-rh to honor that.

This is required for real OSA to work in layer2 mode, which has become
the driver default.

>>> -function do_network() {

>>> -function do_broadcast() {

>> Those two functions are there on purpose, to help users with old
>> parm/conf file and to acquaint them with the changes we introduced.
>> Otherwise they would most probably never get to know that there are
>> parameters in their parm/conf files they do not longer need or they
>> wonder why they don't get applied any more. Please keep those functions.
> 
> Migration guide.  Documentation.

I've never heard of a migration guide. Also, the users already have to
read installation guide, all release notes for all updates since and
including the GA of the major release plus probably the knowledge base.
Based on my experience with holding hands-on install workshops, I doubt
users would detect the useful fact that some parm file options have
changed, if it was documented somewhere. There are already (RHEL 5)
features since a while now that I haven't found officially documented
anywhere, e.g. "clearpart --initlabel --drives=dasda,dasdb,dasdc" for
dasdfmt during kickstart or installation from FCP-attached DVD drive.
Therefore, providing the users with hints on the console, where they
actually look at improves usability significantly at virtually no cost
for development.

>>> -function configure_ipv4_gateway() {

>>> -    if ! tv route add default gw $GATEWAY dev $DEVICE; then

>>> -    echo $"Trying to reach gateway $GATEWAY..."

>>> -        if ! ping -c 1 $GATEWAY >& /dev/null; then
>>> -            echo $"Could not reach your default gateway $GATEWAY"
>>> -            return 1
>>> -        fi
>>> -    fi
>>> -    return 0
>>> -}

>> Now, we are starting to loose significant parts of usability, which was
>> the sole purpose of rewriting linuxrc.s390 in the first place. It is a
>> step back towards the old linuxrc which would only ask all parameters
>> and then try to configure the network ignoring error cases and not
>> helping the user to confirm his/her entries were syntactically and even
>> more so semantically correct.
> 
> This is exactly the kind of stuff that NetworkManager will handle for
> us.  One
> thing that led me to removing this code was that I kept hitting a problem
> where the "ping -c 1" test on $GATEWAY would only work part of the
> time.  If I
> added "-w 5" for a timeout for the entire run, it worked.  But then that's
> just for my VM.  NetworkManager can take care of this for us and can
> report if
> the connection is up or not.

I admit that we have faced some false positives with those heuristics in
linuxrc.s390. However, I wonder what's better: have hints with
occasional false positives or have no user support at all regarding
network parameters. Additionally some false positives are fixable with
minimal effort. We cannot expect every user to be fluent in network
stuff. Many probably just enter what they have been told by their NOC
department and that is fine. Network install is probably the most common
install type on s390. Any algorithmic help we can provide for the users
to detect wrong input early reduces the pain of an installation
especially for the less experienced users which we would like to
support with usability improvements.

Please correct me if I'm wrong, but I suspect NM will "only" check for
the network link but not beyond such as reachability of the gateway or
DNS server. So NM would only account for parts of the usability we had
just introduced. That said, I don't consider loosing a few heuristics a
show stopper here.

> Same explanation as above.  Also, the way linuxrc.s390 is currently written
> won't allow for dual stack configuration, which we need to support during
> installation.  We already have the code to collect the values from users
> and
> NetworkManager can do dual stack configuration of the interface.  I
> still need
> to modify things slightly to allow for dual stack rather than the either/or
> approach now.

That's a good point. Considering that we may get additional network
setup features such as dhcp(6) and IPv6 autoconf, I'm starting to like it.

>>> -function handle_dns() {
>>> -    # - foreach DNS try if server is reachable by one ping

>>> -#                if nslookup $dnsitem $dnsitem >& /dev/null; then

> So the problem I have with this function is that it's entirely too fragile.
> Verifying that DNS servers are where we say they are and that we can
> look up
> things is putting too much trust in DNS during installation.  We don't
> enforce
> this on other platforms and I don't see why we should here.

We don't enforce DNS on s390 as well, so no difference here. If and only
if the user decides to use DNS by entering DNS server addresses which is
optional, then we check if the servers are reachable with ping (the
nslookup part is commented). If we wouldn't try to reach the DNS server,
the dialog workflow would end with the last user input and when loader
or whatever component later in the install process actually tried to
resolve names over DNS, it would be too late and the user could no
longer correct his input. The only option is then to restart the install
all over from the beginning. Huge fun on s390 answering all questions
again from scratch.

>>> -    if [ -n "$mmtu_was_set" ]; then
>>> -        echo "MMTU=\"$MMTU\""
>>> -    elif [ -n "$mtu_was_set" ]; then
>>> +    if [ -n "$mtu_was_set" ]; then
>>>          echo "MTU=$MTU"
>>>      fi

>>> -[ -n "$MMTU" ] && mmtu_was_set=$MMTU
>>
>> What does this change cause? Did you test that this does not break any
>> previously available behavior?
> 
> I set an MTU of 4096 during installation and that still works when I use
> this
> version of linuxrc.s390 that just runs NM.

I meant to ask what removing those variables means to the dialog
workflow and if this may break the workflow, i.e. "restart dialog" might
no longer behave as expected for some parm file options such as MTU.
Such workflow control variables are there for a reason and I had
thoroughly tested that things work as they did with the old linuxrc.
Removing functions that configure L3 aspects of the network, that will
be taken care of by NM, is one thing, but you should let other code
pieces regarding workflow unmodified unless there are good reasons to
modify the workflow.

>>> +    /usr/sbin/NetworkManager
>>> --pid-file=/var/run/NetworkManager/NetworkManager.pid
>>
>> Would we have to wait for NM as in iface.c:iface_start_NetworkManager()
>> for the network to be configured before sshd binds in startinetd?
> 
> Entirely possible.  I don't have that now, but need to add it in.

Here is my general idea: For all the stuff that NM configures, linuxrc
becomes a pure UI without configuring itself. As such it needs all the
UI workflow that is in loader/net.c. I think we may start NM early in
linuxrc, definitely NOT after the dialog loop was left.

Everything between and including the calls to do_hostname and
do_searchdns needs to be its own atomic configuration item in the
linuxrc UI. After the user has entered everything, we have to give NM
time to configure the network device and see with get_connection() if it
came up correctly and only then we may continue the dialog workflow with
do_dasd. Otherwise, the atomic configuration is started all over at
do_hostname to give the user the possibility to correct his input.

>> And will loader stop calling iface_start_NetworkManager() or does
>> NetworkManager detect itself that it is already running?
> 
> loader will still call iface_start_NetworkManager(), which in turn calls
> is_nm_running() and returns if we already have NM running.

OK, that's good. I overlooked that part. Thanks for explaining.

Steffen

Linux on System z Development

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



_______________________________________________
Anaconda-devel-list mailing list
Anaconda-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/anaconda-devel-list


[Index of Archives]     [Kickstart]     [Fedora Users]     [Fedora Legacy List]     [Fedora Maintainers]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [Yosemite Photos]     [KDE Users]     [Fedora Tools]
  Powered by Linux