On 01/22/2010 11:38 PM, David Cantrell wrote: > On Fri, 22 Jan 2010, Steffen Maier wrote: >> On 01/22/2010 12:21 AM, David Cantrell wrote: >>> Remove linuxrc.s390's usage of /sbin/ip, /sbin/ifconfig, and /sbin/route >>> and instead just write the settings to the ifcfg-DEVICE file, set the >>> device online in sysfs, and start NetworkManager. This prevents a bug > The main goal of this change is to reduce the duplication of effort we > currently have in linuxrc.s390. We use NetworkManager on the other > platforms > to bring up interfaces, so we should use it on s390x too. If it doesn't > work > on s390x, we should fix NetworkManager rather than work around it in > linuxrc.s390. Meanwhile I realized which specific change I really disliked and how to fix it while at the same time accepting most of the removals. Thanks for having been patient. Please see close to the bottom for my idea. Independent of that, if those changes are also planned to go into the rhel6-branch, then I fear we'll face serious regressions. Therefore, I still don't think this is a good fix for 557702 unless you successfully perform a thorough function test. To give you an impression of what that means, here is a rough sketch of what I did before submitting the linuxrc rewrite (and yes many of those test cases revealed flaws): real hsi, layer2, nfs real hsi, layer3, http ctc, nfs lcs, ftp real osa, layer2, ftp real osa, layer3, ftp guestlan osa, layer2, nfs guestlan osa, layer3, http guestlan hsi, ftp real osa, layer2, ipv6, ftp [could fetch stage2 but not further] real osa, layer3, ipv6, ftp [could fetch stage2 but not further] Each of those test cases was tested manually with interactive dialogs (intentionally entering wrong values and correcting parameters or restarting the whole dialog) and once more fully unattended with parm&conf file & kickstart. >>> loader/linuxrc.s390 | 313 >>> ++++----------------------------------------------- >>> 1 files changed, 22 insertions(+), 291 deletions(-) >>> -# sets device up and blocks until device appears to be up >>> -function set_device_up() { >> By removing this, you rely on enough time to have passed until NM gets >> started which is fragile. Reminds me of device_cio_free or dasd_settle >> et al. > > NM continually polls the interfaces, so I don't see the point of this > function. OK, fair enough. >>> - [ -z "$mmtu_was_set" ] && unset MMTU >> >> This is still required for show_parm() to work. I never fully understood >> why this magic is there to distinguish between user input and parm/conf >> file input. However, I wouldn't want to break any old feature that was >> in linuxrc.s390 before I touched it. > > The MMTU variable gets set to "mtu $MTU" and used as an argument to > /sbin/ifconfig and /sbin/ip. NetworkManager handles this fine when it > sees an > MTU variable in the device ifcfg file. The variable above is not related to configuring anything but to control the dialog workflow. Please see below for some more details. > It looks like NM does not do anything with the MACADDR variable > currently. We can patch ifcfg-rh to honor that. This is required for real OSA to work in layer2 mode, which has become the driver default. >>> -function do_network() { >>> -function do_broadcast() { >> Those two functions are there on purpose, to help users with old >> parm/conf file and to acquaint them with the changes we introduced. >> Otherwise they would most probably never get to know that there are >> parameters in their parm/conf files they do not longer need or they >> wonder why they don't get applied any more. Please keep those functions. > > Migration guide. Documentation. I've never heard of a migration guide. Also, the users already have to read installation guide, all release notes for all updates since and including the GA of the major release plus probably the knowledge base. Based on my experience with holding hands-on install workshops, I doubt users would detect the useful fact that some parm file options have changed, if it was documented somewhere. There are already (RHEL 5) features since a while now that I haven't found officially documented anywhere, e.g. "clearpart --initlabel --drives=dasda,dasdb,dasdc" for dasdfmt during kickstart or installation from FCP-attached DVD drive. Therefore, providing the users with hints on the console, where they actually look at improves usability significantly at virtually no cost for development. >>> -function configure_ipv4_gateway() { >>> - if ! tv route add default gw $GATEWAY dev $DEVICE; then >>> - echo $"Trying to reach gateway $GATEWAY..." >>> - if ! ping -c 1 $GATEWAY >& /dev/null; then >>> - echo $"Could not reach your default gateway $GATEWAY" >>> - return 1 >>> - fi >>> - fi >>> - return 0 >>> -} >> Now, we are starting to loose significant parts of usability, which was >> the sole purpose of rewriting linuxrc.s390 in the first place. It is a >> step back towards the old linuxrc which would only ask all parameters >> and then try to configure the network ignoring error cases and not >> helping the user to confirm his/her entries were syntactically and even >> more so semantically correct. > > This is exactly the kind of stuff that NetworkManager will handle for > us. One > thing that led me to removing this code was that I kept hitting a problem > where the "ping -c 1" test on $GATEWAY would only work part of the > time. If I > added "-w 5" for a timeout for the entire run, it worked. But then that's > just for my VM. NetworkManager can take care of this for us and can > report if > the connection is up or not. I admit that we have faced some false positives with those heuristics in linuxrc.s390. However, I wonder what's better: have hints with occasional false positives or have no user support at all regarding network parameters. Additionally some false positives are fixable with minimal effort. We cannot expect every user to be fluent in network stuff. Many probably just enter what they have been told by their NOC department and that is fine. Network install is probably the most common install type on s390. Any algorithmic help we can provide for the users to detect wrong input early reduces the pain of an installation especially for the less experienced users which we would like to support with usability improvements. Please correct me if I'm wrong, but I suspect NM will "only" check for the network link but not beyond such as reachability of the gateway or DNS server. So NM would only account for parts of the usability we had just introduced. That said, I don't consider loosing a few heuristics a show stopper here. > Same explanation as above. Also, the way linuxrc.s390 is currently written > won't allow for dual stack configuration, which we need to support during > installation. We already have the code to collect the values from users > and > NetworkManager can do dual stack configuration of the interface. I > still need > to modify things slightly to allow for dual stack rather than the either/or > approach now. That's a good point. Considering that we may get additional network setup features such as dhcp(6) and IPv6 autoconf, I'm starting to like it. >>> -function handle_dns() { >>> - # - foreach DNS try if server is reachable by one ping >>> -# if nslookup $dnsitem $dnsitem >& /dev/null; then > So the problem I have with this function is that it's entirely too fragile. > Verifying that DNS servers are where we say they are and that we can > look up > things is putting too much trust in DNS during installation. We don't > enforce > this on other platforms and I don't see why we should here. We don't enforce DNS on s390 as well, so no difference here. If and only if the user decides to use DNS by entering DNS server addresses which is optional, then we check if the servers are reachable with ping (the nslookup part is commented). If we wouldn't try to reach the DNS server, the dialog workflow would end with the last user input and when loader or whatever component later in the install process actually tried to resolve names over DNS, it would be too late and the user could no longer correct his input. The only option is then to restart the install all over from the beginning. Huge fun on s390 answering all questions again from scratch. >>> - if [ -n "$mmtu_was_set" ]; then >>> - echo "MMTU=\"$MMTU\"" >>> - elif [ -n "$mtu_was_set" ]; then >>> + if [ -n "$mtu_was_set" ]; then >>> echo "MTU=$MTU" >>> fi >>> -[ -n "$MMTU" ] && mmtu_was_set=$MMTU >> >> What does this change cause? Did you test that this does not break any >> previously available behavior? > > I set an MTU of 4096 during installation and that still works when I use > this > version of linuxrc.s390 that just runs NM. I meant to ask what removing those variables means to the dialog workflow and if this may break the workflow, i.e. "restart dialog" might no longer behave as expected for some parm file options such as MTU. Such workflow control variables are there for a reason and I had thoroughly tested that things work as they did with the old linuxrc. Removing functions that configure L3 aspects of the network, that will be taken care of by NM, is one thing, but you should let other code pieces regarding workflow unmodified unless there are good reasons to modify the workflow. >>> + /usr/sbin/NetworkManager >>> --pid-file=/var/run/NetworkManager/NetworkManager.pid >> >> Would we have to wait for NM as in iface.c:iface_start_NetworkManager() >> for the network to be configured before sshd binds in startinetd? > > Entirely possible. I don't have that now, but need to add it in. Here is my general idea: For all the stuff that NM configures, linuxrc becomes a pure UI without configuring itself. As such it needs all the UI workflow that is in loader/net.c. I think we may start NM early in linuxrc, definitely NOT after the dialog loop was left. Everything between and including the calls to do_hostname and do_searchdns needs to be its own atomic configuration item in the linuxrc UI. After the user has entered everything, we have to give NM time to configure the network device and see with get_connection() if it came up correctly and only then we may continue the dialog workflow with do_dasd. Otherwise, the atomic configuration is started all over at do_hostname to give the user the possibility to correct his input. >> And will loader stop calling iface_start_NetworkManager() or does >> NetworkManager detect itself that it is already running? > > loader will still call iface_start_NetworkManager(), which in turn calls > is_nm_running() and returns if we already have NM running. OK, that's good. I overlooked that part. Thanks for explaining. Steffen Linux on System z Development IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 _______________________________________________ Anaconda-devel-list mailing list Anaconda-devel-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/anaconda-devel-list