On Wed, 21 Jun 2006, Benjamin Herrenschmidt wrote:
>
> So your approach to STD would be something like:
>
> 1- stop subsystems
> 2- driver freeze (in the sense to stop DMA's and other horrors for
>    snapshot, only some drivers care, most don't)
> 3-snapshot

Yes. Where "stop subsystems" could well include some things that we don't
even do now.

> 4-driver thaw, subsystems stay frozen (that is VM, filesystems,
>    userland)

Yes and no. We might actually want to thaw some subsystems too.

Obviously, there's no reason to thaw user programs (even if you could wake
them up, they couldn't be allowed to make any forward progress that is
"visible"), but once you have snapshotted things, you might actually be
better off allowing a fair amount of "normal" operations.

For example, you might decide that you want to actually _kill_ all user
processes at that point, and allow kernel processes that you wanted
quiescent for snapshotting to thaw.

Once you have built the snapshot image, many of the reasons to freeze are
gone - not just for drivers. At that point, the only thing you want to
make sure of is that nobody writes to swap any more, and doesn't write to
the filesystem (or network, for that matter).

> 5-shutdown or driver suspend S4

Not yet.

	5 - write snapshot to disk

Because you need to do that after the thaw, of course.

And only _then_ do you actually shutdown or do S4.

> The only little possible issue there is that the subsystems being still
> stopped, some drivers may need to have a hard time doing 5 if they need
> to send requests to their own hardware for things like hard disk
> spindown, and they happen to use the block layer request queue for that
> (pumping device specific requests into it).

I'd wake up all kernel daemons after snapshotting. There's no reason not
to, really (kswapd might be a special case, but quite frankly, I think
we're better off "turning off swap" than necessarily turning off kswapd
itself - ie again, the appropriate level to make sure swap doesn't get
dirtied afterwards is likely _higher_ up than the level that actually
makes the IO itself happen).

		Linus
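
The ordering discussed in this message can be sketched as a toy model. This is only an illustration, not kernel code: the class `SuspendToDisk`, its methods, and `run_sequence` are hypothetical names invented here. It assumes that a snapshot needs quiesced drivers, that the image can only be written once drivers are thawed again, and that ordinary writers to swap, filesystem or network are refused after the snapshot.

```python
class SuspendToDisk:
    """Toy model of the suspend-to-disk ordering (all names hypothetical)."""

    def __init__(self):
        self.log = []
        self.drivers_frozen = False
        self.snapshot_taken = False
        self.image_written = False

    def stop_subsystems(self):
        self.log.append("stop subsystems")

    def freeze_drivers(self):
        # Quiesce DMA and similar activity so memory is stable.
        self.drivers_frozen = True
        self.log.append("freeze drivers")

    def snapshot(self):
        if not self.drivers_frozen:
            raise RuntimeError("snapshot requires frozen drivers")
        self.snapshot_taken = True
        self.log.append("snapshot")

    def thaw(self):
        # Drivers and kernel daemons run again; user programs stay stopped.
        self.drivers_frozen = False
        self.log.append("thaw drivers and kernel daemons")

    def normal_write(self, target):
        # After the snapshot, nothing may dirty swap, filesystem or network.
        # The check sits above the IO layer: the request itself is refused.
        if self.snapshot_taken:
            raise PermissionError("no writes to %s after snapshot" % target)
        self.log.append("write to %s" % target)

    def write_image(self):
        # The image writer is the one permitted writer after the snapshot.
        if not self.snapshot_taken or self.drivers_frozen:
            raise RuntimeError("image write needs a snapshot and thawed drivers")
        self.image_written = True
        self.log.append("write snapshot to disk")

    def power_off(self):
        if not self.image_written:
            raise RuntimeError("image must be on disk before shutdown or S4")
        self.log.append("shutdown or S4")


def run_sequence():
    s = SuspendToDisk()
    s.stop_subsystems()
    s.freeze_drivers()
    s.snapshot()
    s.thaw()
    s.write_image()
    s.power_off()
    return s.log


if __name__ == "__main__":
    for step in run_sequence():
        print(step)
```

In this sketch, calling `write_image()` before `thaw()`, or `normal_write("swap")` after `snapshot()`, raises an error, which mirrors the two constraints above: the image is written only after the thaw, and swap is kept clean by refusing the write at a higher level rather than by stopping whatever performs the IO.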