On Tue, 08.02.11 13:52, Andrey Borzenkov (arvidjaar@xxxxxxx) wrote:

> I am probably the wrong one to ask, but here is what happens when
> array is started (from udev perspective)

[...]

> After this event device goes "plugged" and SYSTEMD_WANTS (if any) are
> triggered. But at this point we have zero information about array to
> decide anything.

[...]

> At this point we know it is container, know that it has external
> metadata and know that we need external metadata handler (mdmon). But
> it is too late for systemd.

Kay, do you know why this "change" event is used here? Any chance we
can get rid of it?

> >> Actually it can be implemented even without mdadm patches; apparently
> >> it is possible to suppress normal starting of mdmon by setting
> >> MDADM_NO_MDMON=1
> >
> > At this point mdmon is simply broken: if glibc or mdmon itself (or any
> > lib it is using) is upgraded, then mdmon will keep referencing the old
> > .so or binary as long as it is running. This means that the fs these
> > files are on cannot be remounted r/o. However mdmon insists on being
> > shutdown only after all fs got remounted ro. So you have a cyclic
> > ordering loop here: mdmon wants to be shut down after the remount, but
> > we need to shut it down before the remount.
>
> Ehh ...
>
> a) mdmon is perfectly capable of restarting, it is already used to
> take over mdmon launched in initrd. The problem is to know when to
> restart - i.e. when respective libraries are changed. This is a job
> for package management in distribution. It is already employed for
> glibc, systemd and some others and can just as well be employed for
> mdmon. And this is totally unrelated to systemd :)

Really, you are saying there is a synchronous way to make mdmon reexec
itself? How does that work?

> b) having binary launched off some fs should not prevent this fs to be
> remounted ro - binaries are not opened rw

If you run a binary and the package manager then replaces it, the
running instance will still refer to the old copy, which has the
effect that the file isn't actually deleted until the process exits
or execs. And because that is the way it is, the kernel will refuse
unmounting of the fs until you have terminated or reexeced your process.

> > This is unfixable unless a) mdmon learns reexecution of itself without
> > losing state (like most init systems do), or b) mdmon would stop
> > insisting on being shutdown only after the remount.
>
> As far as I can tell, both is true today; but remounting is not
> enough, unfortunately.

So, you are saying we can shut down mdmon without ill effects early?

> > In my eyes b) is very much preferable: It should be possible to shut
> > down mdmon like any other service. And if then some md related code
> > still needs to be run on late shutdown this should be done from a new
> > process. I would be willing to add some hooks for this, so that we can
> > execute arbitrary drop-in processes as part of the final shutdown loop.
>
> mdmon is needed to ensure metadata were correctly updated. So it needs
> to exist as long as metadata *may* be updated. For practical purposes
> it means - until file system is unmounted and flushed to disks. I am
> not sure that remounting ro stops all activity (at least, mounting ro
> definitely *writes* to device using some filesystems).

Well, the root file system cannot be unmounted, only remounted. So, if
there is a way to invoke mdmon so that it flushes all metadata changes
to disk and then immediately terminates, this should be all we need
for a clean solution. We'd then shut down the normal instances of mdmon
like any other daemon and simply invoke this metadata flushing command
as part of late shutdown.

Lennart

--
Lennart Poettering - Red Hat, Inc.
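
As an aside on the replaced-binary point above, here is a minimal
Python sketch of the underlying behaviour, assuming a Linux system. It
is not mdmon code, and the function name is made up for illustration.
It uses an ordinary open file descriptor, whereas a running process
holds its binary and libraries through mappings, but the effect on the
inode is the same: the name goes away while the file stays allocated
until the last user lets go.

```python
import os
import tempfile


def deleted_file_stays_alive():
    """Unlink a file while it is still open and report what remains.

    Hypothetical helper for illustration only. Returns a tuple of
    (link count, data read back, whether the path still exists).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"old copy")
        # Roughly what a package upgrade does to the old file: remove its name.
        os.unlink(path)
        # The inode has no name left (link count is normally 0 on Linux),
        # but it is still allocated because we hold it open.
        link_count = os.fstat(fd).st_nlink
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)  # contents are still readable through the fd
        return link_count, data, os.path.exists(path)
    finally:
        # Only after the last reference is dropped can the fs free the inode.
        os.close(fd)


if __name__ == "__main__":
    print(deleted_file_stays_alive())
```

While such an unlinked-but-open file exists, freeing it is a pending
write to the file system, which is why the process has to exit or
exec before the fs can be cleanly made read-only.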