James Hogarth wrote: > On 13 February 2017 at 15:35, <m.roth@xxxxxxxxx> wrote: >> My manager tells me a system in the datacenter is down. I go down there, >> and plug in a monitor-on-a-stick and keyboard. It's up, but no network. >> I >> try systemctl restart NetworkManager several times, and ip a shows *no* >> change. >> >> Finally, I do an ifdown, followed by an ifup, and everything's >> wonderful. >> >> My manager thinks that the NM daemon thinks everything's fine, and >> there've been no changes, so it does nothing. He suggests that it might >> have to be stopped, then started, rather than restarted. >> >> This is completely unacceptable behavior, since it leave the system with >> no network connection. Pre-systemd, as we all know, restart *RESTARTED* >> the damn thing. >> >> Is there some Magic (#insert "pixie-dust-sparkles") incantation, either >> restarting NetworkManager, or using nm-cli, to force it to perform the >> expected actions? >> > > > I'd be interested in the journal from the NetworkManager restart as > that's not the way it behaves ... it uses the netlink API to get state > and not it's own internal tracker of state (ie doing an ip link down > will reflect in nmcli output) ... a restart of NetworkManager should > not ignore interfaces but rather bring the system to the on disk > configured state ... and a quick check it doesn't override ExecRestart > in the unit file to do a reload or similar instead ... > > And indeed a quick test in a VM shows nmcli device status correctly > changing between connected and unavailable when doing ip link set eth0 > down/up > > Do note that on a NM based system ifup and ifdown are effectively > aliases to nmcli conn down and nmcli conn up > > nmcli conn down "connection name" will make it disconnected > nmcli conn up "connection mame" will bring it back to connected > > there is a slight interesting difference between using nmcli and ip > link set though ... > > with ip link set down <interface> the interface is marked > administratively down (as if you've pulled the cable) but nmcli conn > down "connection name" will unconfigure the interface but leave it in > an UP state ... just without an IP address etc > > anyway that's just an interesting diversion on behavioural differences > > NM won't change an interface state without some sort of event though > (manual or virtual cable pulled etc), and if you have a case where it > *has* done that then you have found a bug that would be great to get > reported > > TL;DR: cannot reproduce, need logs to determine what happened without > a working crystal ball >From journalctl, I see this happening when I do systemctl restart NetworkManager (much edited) Feb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.7755] manager: (em1): new Ethernet device (/org/freedesktop/NetworkManager/Devi Feb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.7791] ifcfg-rh: add connection in-memory (79d3ed9d-cc41-498c-9169-44320e332f68, Feb 13 09:47:52 <servername> systemd[1]: Started Hostname Service. Feb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.7797] device (em1): state change: unmanaged -> unavailable (reason 'connection- Feb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.7805] device (em1): state change: unavailable -> disconnected (reason 'connecti <...> eb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.7986] device (em1): state change: disconnected -> prepare (reason 'none') [30 4 Feb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.7999] policy: set 'em1' (em1) as default for IPv6 routing and DNS Feb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.8027] device (em1): state change: prepare -> config (reason 'none') [40 50 0] Feb 13 09:47:52 <servername> NetworkManager[67312]: <info> [1486997272.8034] device (em1): state change: config -> ip-config (reason 'none') [50 70 0] Feb 13 09:47:53 <servername> NetworkManager[67312]: <info> [1486997273.3594] device (em1): state change: ip-config -> ip-check (reason 'none') [70 80 Feb 13 09:47:53 <servername> NetworkManager[67312]: <info> [1486997273.3661] device (em1): state change: ip-check -> secondaries (reason 'none') [80 9 Feb 13 09:47:53 <servername> NetworkManager[67312]: <info> [1486997273.3666] device (em1): state change: secondaries -> activated (reason 'none') [90 Feb 13 09:47:53 <servername> NetworkManager[67312]: <info> [1486997273.3667] manager: NetworkManager state is now CONNECTED_GLOBAL Feb 13 09:47:53 <servername> NetworkManager[67312]: <info> [1486997273.3670] manager: NetworkManager state is now CONNECTED_SITE Feb 13 09:47:53 <servername> NetworkManager[67312]: <info> [1486997273.3670] manager: NetworkManager state is now CONNECTED_GLOBAL Feb 13 09:47:53 <servername> nm-dispatcher[67317]: req:2 'connectivity-change': new request (6 scripts) Feb 13 09:47:53 <servername> nm-dispatcher[67317]: req:2 'connectivity-change': start running ordered scripts... Feb 13 09:47:53 <servername> NetworkManager[67312]: <info> [1486997273.3697] device (em1): Activation: successful, device activated. Note there is no IP address being obtained. Now, when I run ifdown/ifup: Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6804] device (em1): Activation: starting connection 'em1' (c432eaa1-023b-4f1f-a Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6809] audit: op="connection-activate" uuid="c432eaa1-023b-4f1f-a7b5-4605ec07195 Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6810] device (em1): state change: disconnected -> prepare (reason 'none') [30 4 Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6811] manager: NetworkManager state is now CONNECTING Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6816] device (em1): state change: prepare -> config (reason 'none') [40 50 0] Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6858] device (em1): state change: config -> ip-config (reason 'none') [50 70 0] Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6869] dhcp4 (em1): activation: beginning transaction (timeout in 45 seconds) Feb 13 09:48:17 <servername> NetworkManager[67312]: <info> [1486997297.6900] dhcp4 (em1): dhclient started with pid 67715 Feb 13 09:48:17 <servername> dhclient[67715]: DHCPDISCOVER on em1 to 255.255.255.255 port 67 interval 6 (xid=0x745ba623) Feb 13 09:48:17 <servername> dhclient[67715]: DHCPREQUEST on em1 to 255.255.255.255 port 67 (xid=0x745ba623) Feb 13 09:48:17 <servername> dhclient[67715]: DHCPOFFER from <DHCP server> Feb 13 09:48:17 <servername> dhclient[67715]: DHCPACK from <DHCP server> (xid=0x745ba623) And it then gets an IP address. And looking at /var/log/messages, it *appears* that the restart never invokes the dhclient script, while ifup does. mark _______________________________________________ CentOS mailing list CentOS@xxxxxxxxxx https://lists.centos.org/mailman/listinfo/centos