On Sun, 23 Oct 2011 01:00:36 -0700 Dan Williams <dan.j.williams@xxxxxxxxx>
wrote:

> On Tue, Feb 8, 2011 at 9:28 AM, Lennart Poettering
> <lennart@xxxxxxxxxxxxxx> wrote:
> > On Tue, 08.02.11 16:54, Andrey Borzenkov (arvidjaar@xxxxxxx) wrote:
> >
> >> >> a) mdmon is perfectly capable of restarting, it is already used to
> >> >> take over mdmon launched in initrd. The problem is to know when to
> >> >> restart - i.e. when respective libraries are changed. This is a job
> >> >> for package management in distribution. It is already employed for
> >> >> glibc, systemd and some others and can just as well be employed for
> >> >> mdmon. And this is totally unrelated to systemd :)
> >> >
> >> > Really, you are saying there is a synchronous way to make mdmon reexec
> >> > itself? How does that work?
> >> >
> >>
> >> I am not sure whether it qualifies as synchronous, but "mdmon
> >> --takeover" will kill any existing mdmon for this and start monitoring
> >> itself.
> >
> > I wonder if this is really fully synchronous, i.e. that a) there is no
> > point in time where mdmon is not running during this restart and b) the
> > mdmon --takeover command returns when the new daemon is fully up, and
> > not right-away.
> >
> >> > Well, the root file systems cannot be unmounted, only remounted.
> >> >
> >> > So, is there a way to invoke mdmon so that it flushes all metadata
> >> > changes to disk and immediately terminates then this should be all we
> >> > need for a clean solution. We'd then shutdown the normal instances of
> >> > mdmon down like any other daemon and simply invoke this metadata
> >> > flushing command as part of late shutdown.
> >>
> >>
> >> Hmm ... it looks like you just need to
> >>
> >> start mdmon
> >> do mdadm --wait-clean
> >>
> >> After this you can kill mdmon again (assuming device is no more in
> >> use).
> >
> >
> > Well, it would be nice if the md utils would offer something doing this
> > without spawning multiple processes and killing them again.
> >
>
> /me wonders why his raid5 resyncs every boot on Fedora 15 and has
> found this old thread.
>
> I'm tempted to:
>
> 1/ teach ignore_proc() to scan for pid files in /dev/md/ (MDMON_DIR on Fedora)
> 2/ arrange for mdadm --wait-clean --scan to be called after all
>    filesystems have been mounted read only
>
> ...but a few things strike me. This does not seem to be what was
> being proposed above. Systemd does not treat dm devices like a
> service and takes care to shut them down explicitly (but in that case
> there is an api that it can call). Is it time for a libmd.so, so
> systemd can invoke the "--wait-clean --scan" process itself? Probably
> simpler to just SIGTERM mdmon and wait for it.
>
> --
> Dan

Hi Dan,

could you please explain in a bit more detail exactly what you think it is
that is going wrong for you? I don't think it is anything like the original
problem, as I don't think you are starting arrays manually.

I think your problem is that 'mdmon' is being killed too early at shutdown.
Clearly we need to get whatever-kills-user-processes to skip mdmon - maybe by
writing the pid to some magic file that 'ignore_proc' already knows about?

Ultimately we probably want to get udev to start mdmon for us and have mdadm
notice and not start it itself. We also need to get udev to notice arrays
that are being reshaped and to start the mdadm which monitors the reshape so
that mdadm doesn't have to fork it itself. That should fix the original
problem, but I don't think it addresses your problem at all.

I don't have a Fedora install so I cannot hunt around to see what is
happening.

I don't like the idea for a 'libmd.so' at all - certainly not until the
problem is properly understood and other solutions (like running scripts)
prove ineffective.

NeilBrown
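
For reference, the "running scripts" route for the late-shutdown flush
quoted above (start mdmon, then mdadm --wait-clean) might look roughly like
the sketch below. It is only an illustration under stated assumptions, not
something any distribution is known to ship: the function names
flush_commands and run_flush are made up here, "mdmon --takeover --all" is
assumed to be an acceptable way to (re)start monitoring for every container,
and the script defaults to a dry run that only prints the commands.

```python
import shutil
import subprocess


def flush_commands(mdadm="mdadm", mdmon="mdmon"):
    """Return the argv lists for the sequence quoted in the thread.

    Hypothetical helper: the ordering follows the quoted suggestion
    (start mdmon, then mdadm --wait-clean); using --takeover --all to
    start mdmon for every container is an assumption of this sketch.
    """
    return [
        [mdmon, "--takeover", "--all"],     # make sure a monitor is running
        [mdadm, "--wait-clean", "--scan"],  # wait until arrays are marked clean
    ]


def run_flush(dry_run=True):
    """Run (or, by default, only print) the flush sequence in order."""
    handled = []
    for argv in flush_commands():
        if dry_run or shutil.which(argv[0]) is None:
            # Dry run, or the tool is not installed: just show the command.
            print("would run:", " ".join(argv))
        else:
            # check=False: a failure here should not abort a late-shutdown script.
            subprocess.run(argv, check=False)
        handled.append(argv)
    return handled


if __name__ == "__main__":
    run_flush(dry_run=True)
```

Such a script would only help if it runs after all filesystems have been
remounted read-only and if whatever kills user processes at shutdown has
left mdmon alone until then, which is the open question in this thread.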