A few weeks ago, I received a bug report regarding one of our Fedora packages: a request to migrate its init configuration to systemd. A quick search of our Fedora repo shows that systemd has been available since FC14, so I guess it is about time we adapted our package. So we did. The service definition is simple enough and the documentation is well done; it was really easy to use systemd to start our application daemon. Service definitions lack a small piece of functionality needed to do exactly what we want at the installation configuration phase, but we found a solution within systemd (which, while not perfect, works). Our main RPM now requires a secondary sysvinit or systemd RPM according to distribution flavor. Nice and easy.

Reading about systemd features, I told myself it could be the right tool to wake up an old project of mine that exploits the kernel's container features, and to have the latest Fedora (FC18) running within a container under a fresh kernel (3.9.4). This little project gave satisfactory results with various distributions when I designed and tested it 2 years ago.

First I checked it with a standard EL6.4 template (400 Megs) under this new kernel (3.9.4, HOST EL6.4) to see if my tool was still operational. Everything went perfectly. I was ready to test FC18.

The selected FC18 template is a very standard one (a 939 MBytes tgz file) which (and this is a key factor) was proven to be fully working "as is" in an openvz container (kernel 2.6.32-042stab076.8). "As is" means that the template was never tailored for an openvz container (it is used out of the box there) and could be used to seed a working HOST too.

I was not expecting to have it fully working at the first attempt in my own container design, but I was expecting systemd (using the very detailed status output of systemctl) to give me very good insight into any issues that might occur.
The real goal was to learn how to use systemd components to diagnose a real system "in trouble", a kind of flight simulator exercise, so that we would be ready in the future to do a quick diagnosis if one of our servers in a rack had trouble booting or rebooting with EL7. If the result of this exercise was positive enough, why not try to install systemd within our current deployment, as systemd is sysvinit compatible?

The exercise would be considered a success if I was able to log in to an FC18 container from a remote location via SSH, with the SSH port protected by the container's own iptables (a very minimal number of services started, a "safe haven" mode to recover a system from trouble).

This small exercise turned very ugly very quickly. I worked very hard, trying all the tricks and bypasses I could think of to collect data. To my dismay, I was unable to get predictable behaviour or reliable data from systemd, even in the emergency.service mode. After a while, I was forced to face it: systemd wouldn't help me, not even to start the system in a minimal mode. I was not able to go beyond kernel level with systemd in control; the services started were a total mess and the container was totally locked up, with no exploitable data provided.

(Quickly: we had interesting situations in the noisy and cold server room, using the emergency.service console, such as:

  $ systemctl start systemd-journald.service
  --> "unable to comply!" A dependency job for systemd-journald.service
      failed, see journalctl -xn.
  $ journalctl -xn
  --> "unable to comply!" No journal files were found.
)

Let's be blunt... from what I have seen: in a perfect world, systemd is obviously a nice gadget; in the real world, systemd is the perfect tool to transform a small problem into a terminal "cascading failure" event.

I sent a private email to Lennart about my 'little concern', giving more details, explaining as well as I could, and suggesting solutions (mainly for brainstorming purposes).

Lennart answered quickly, and rejected my "worries" with a wave of the hand. To summarize, his answer was: "systemd can work only as PID1, you are out of spec, we do not support openvz, good luck".

Obviously, he didn't understand that I wasn't trying to run systemd on an openvz kernel, but rather on a plain 3.9.4 kernel. Neither was I requesting help to get FC18 running inside the container; I was rather pointing out that systemd is not able to cope with the "hostile conditions" part of an init process's duty. Troubles are by definition always 'out of spec'. The part about "systemd can work only as PID1" increased my concerns by an order of magnitude.

I ended up asking myself 'what part of this puzzle am I missing?' I dug around in Google about systemd and was stunned by the results: I found my concerns had already been expressed multiple times, in more talented words than mine, and as early as 2010.

Since that time, it is my understanding that systemd has continuously tried to resolve problems by increasing its complexity and extending its dependencies and its centrality. This is wrong, this is very very wrong. A program as complex as systemd can't be a mandatory PID1 in an environment as open as UNIX. We just defined a new oxymoron: "PID1 systemd".

The next paragraph of this email is dedicated to the Red Hat person reading this mailing list as part of their "technology watch" duty.

===--
It is my understanding EL7 will include systemd as init process. In the current working state of systemd, and if it is included within EL7 as mandatory PID1, we won't deploy EL7 within our server racks. Either we'll stay with EL6 or we'll move to another distribution (or another OS). Adding a kernel-type program on top of a kernel just moves the troubleshooting of big trouble from a 4-solution matrix (hardware+kernel) to an 8-solution matrix (hardware+kernel+systemd) that needs to be resolved before being able to access and work on the system.
Reading, via Google, tells me I am not the only one contemplating this very dilemma.
--===----

BTW, and to go a little bit beyond the systemd case: since 1991, FC18 is the very first distribution I was NOT successful in installing on plain hardware (not speaking about containers here, rather very plain hardware with software RAID disks. On the same hardware, with the same configuration parameters, EL6.X, Mageia-3 and slackware-14.0 install A_OK). I am starting to wonder if we (this "we" includes dev contributors and myself too) could be on the wrong path in the way we implement software in Fedora.

To summarize: it is very easy to write code for an open platform, far more difficult to write code that keeps the platform open (but this is another story, maybe another time...:-}}).

That's all folks.... :-} Did I say "Nay" to systemd?

--
A bientôt
===========================================================
Jean-Marc Pigeon                        E-Mail: jmp@xxxxxxx
SAFE Inc.                               Phone: (514) 493-4280
  Clement, 'a kiss solution' to get rid of SPAM (at last)
     Clement' Home base <"http://www.clement.safe.ca">
===========================================================
--
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel