On Tuesday 11 September 2018 at 19:54 +0530, Nigel Babu wrote:
> On Tue, Sep 11, 2018 at 7:06 PM Michael Scherer <misc@xxxxxxxxxx>
> wrote:
>
> > And... rescue mode is not working. So the server is down until
> > Rackspace fix it.
> >
> > Can someone disable the freebsd smoke test, as I think our 2nd
> > builder is not yet building fine ?
> >
>
> Disabled. Please do not merge any JJB review requests until this is
> fixed.

So, just to keep people updated on this adventure:

After 3 or 4h, the rescue mode managed to appear. It turns out that it
takes time to copy the rescue image to the cloud, and I guess no one
recently did that for freebsd. Rackspace says "40 minutes", which is
still a lot.

This morning, I was greeted by a welcoming prompt saying "zfs, can't
mount the root" or something, because the rescue mode of rackspace was
a bit broken. While trying to fix it, I rebooted out of rescue mode.

So I went back to my initial plan: "boot in rescue mode". Thanks to the
superhuman reflexes I acquired dodging Nerf guns in the office, I
managed to hit the "s" key at the right time. While it seems easy, the
remote console of rackspace is a bit slow to show the boot loader, the
latency from France over the Atlantic is noticeable, and the interface
disconnects itself after every minute of idling, and every time the
video mode changes (like when it goes from POST to the bootloader
menu).

So that was a race between the machine and me, and I won it, with a
"#_ " prompt waiting for me.

Of course, things would be too simple if that was just that, so / was
readonly. And that's freebsd, so "mount -o rw,remount /" didn't work.

It didn't work for 2 reasons:
- zfs does not work like this ("zfs set readonly=off zroot", for
  people asking how to do it later)
- the keyboard was kinda broken. Not broken like "that's a us layout
  and misc is using a french layout" broken, because that, I know how
  to deal with. More broken like "only the a-z keys are working, and
  the rest are randomly placed somewhere else".

It turns out that without '.' and '/', things can be complicated in the
shell. But I am resourceful, and judicious use of <tab> + <backspace>
let me get what I needed.

So I managed to change the root password and rebooted again. The
password didn't work the first time (did I mention latency), I retried,
and it worked.

And so, the network is fine. But sshd didn't start.

Why ? Because it checks the config, and the config showed a warning on
a "duplicate line". The upgrade changed the sshd config file, adding a
ton of comments, and at the end, a duplicated line:

    Subsystem sftp /usr/libexec/sftp-server

And upon removal, things were working.

However, I rebooted to test and ... still broken. So back to me racing
against the bootloader, I guess (because the temp password I set is not
working again, I wonder if there was some fallback due to zfs or
something).

So while that's ultimately my fault for not reading the 150-line diff
presented by the upgrade prompt, maybe they should not have blocked on
a warning, and/or should have verified the config before asking me to
reboot.

I guess the lesson is that I kinda need to write a playbook for that,
because 11.0 is around the corner.

On a related note, we will be putting the 2nd builder (the one in our
DC) as a non voting job, so we can see if this builder works. The main
issue is that the builder in the cloud (freebsd0) is a custom, manually
installed one, and so there are some modifications that were not
recorded (in fact, all of them). So builds work there, but not on a
freshly installed one.

Niels has been working on fixing that, but last time I checked (1 month
ago), we still had issues that were found after I enabled the builder,
and this broke the smoke tests (something around installation). I
wanted to file a bug for that, but haven't had time yet.

So a non voting job would be perfect for that.

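For the sshd part of the story, a check for duplicated Subsystem lines could be run before rebooting after an upgrade. The following is only a minimal sketch in Python, not something that exists in our playbooks: the function name find_duplicate_subsystems and the sample config are made up for illustration, and the parsing is deliberately naive (whitespace-separated directives, '#' comments). sshd's own test mode ("sshd -t") remains the real check; this just shows the idea.

```python
from collections import defaultdict


def find_duplicate_subsystems(config_text):
    """Return {subsystem_name: [line numbers]} for names declared more than once.

    Hypothetical helper: naive parsing that only looks at
    'Subsystem <name> <command>' lines and skips comments and blanks.
    """
    seen = defaultdict(list)
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # sshd_config keywords are case-insensitive, so compare in lowercase.
        if len(parts) >= 2 and parts[0].lower() == "subsystem":
            seen[parts[1]].append(lineno)
    return {name: lines for name, lines in seen.items() if len(lines) > 1}


# Made-up sample mimicking what the upgrade left behind.
sample = """\
# default config
Port 22
Subsystem sftp /usr/libexec/sftp-server
# comments added by the upgrade
Subsystem sftp /usr/libexec/sftp-server
"""

print(find_duplicate_subsystems(sample))  # {'sftp': [3, 5]}
```

A playbook task could fail the upgrade step when the returned dict is not empty, instead of finding out at the next boot.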
-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel