On Po 13-03-06 16:28:56, Dave Jones wrote:
> On Mon, Mar 13, 2006 at 10:24:20PM +0100, Pavel Machek wrote:
>
> > > if suspend-to-disk is fast enough, you could just *always* write
> > > to disk, even if we're doing S3. If power runs out, you then have a
> > > valid resume image on-disk. iirc, this is what Windows does.
> >
> > Yep, I call that suspend-to-both. It is planned, but not really
> > trivial, and I'm a little busy. If someone wants to help....
>
> I was thinking a few days ago. With your move of all this stuff to
> userspace, if it was done in multiple stages, we could implement
> a form of checkpointing this way.

It is possible...

> So instead of doing the 'suspend to disk/ram' after 'write out all pages',
> we just continue.

...but it is not _that_ simple. Preparing video for suspend-to-ram is
a rather nasty piece of code, and I'd rather not have it run after the
system is frozen. The sequence needs to be something like:

	prepare video for s2ram, vbetool save if necessary
	FREEZE
	SNAPSHOT
	save image to disk
	run s2ram

Immediately after wakeup, the s-t-disk signature needs to be removed,
otherwise we risk two resumes from one suspend.

> Why is this useful ? We've seen bugs reported that only ever bite customers
> after they've run their workload for a month. Now, if they had a means
> of checkpointing, then when it crashes, they could capture the last image
> that landed somewhere, and set that up for more tests/monitoring with kprobes etc
> and reproduce those hard-to-reproduce bugs a lot faster.

Yes, you can do it, but:

1) each SNAPSHOT takes a few seconds, and it is a rather disruptive
action -- it includes a console switch. It needs half of memory free.

2) it only snapshots memory. To be able to continue from a saved
snapshot, you'd need to save the swap partition and all mounted
filesystems.

Maybe you don't need 2) -- like kernel state is enough for you, or
maybe you can do some magic with device mapper.

Actually I have played with this idea myself.
Running the system entirely on a ramdisk would make periodic snapshots
feasible, and it could do tricks like system-level undo. Maybe I'll
prepare some demo when I get the time... It's probably going to be a
toy, though.

								Pavel
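
To make the ordering constraint for suspend-to-both concrete, here is a rough sketch of the sequence from the mail above. It is only an illustration: the `ops` dictionary and its step names are hypothetical placeholders, not real kernel or s2ram interfaces, and the try/finally around the s2ram step is my own assumption about how to make sure the signature is removed whenever the machine keeps running after the image was written.

```python
from typing import Callable, Dict

# Hypothetical step names mirroring the sequence in the mail; the callables
# behind them are placeholders, not real suspend interfaces.
STEP_NAMES = [
    "prepare_video",     # prepare video for s2ram, vbetool save if necessary
    "freeze",            # FREEZE
    "snapshot",          # SNAPSHOT
    "save_image",        # save image to disk
    "s2ram",             # run s2ram; returns once the machine wakes up
    "remove_signature",  # invalidate the s-t-disk signature after wakeup
]


def suspend_to_both(ops: Dict[str, Callable[[], None]]) -> None:
    """Run the suspend-to-both steps in the order given in the mail."""
    # Video preparation happens first, while the system is still running.
    ops["prepare_video"]()
    ops["freeze"]()
    ops["snapshot"]()
    ops["save_image"]()
    try:
        ops["s2ram"]()
    finally:
        # Assumption: once the image is on disk, the signature has to go as
        # soon as we continue running, even if s2ram itself failed, so that
        # one suspend can never lead to two resumes.
        ops["remove_signature"]()


if __name__ == "__main__":
    # Dry run: each step just prints its own name.
    demo = {name: (lambda name=name: print(name)) for name in STEP_NAMES}
    suspend_to_both(demo)
```

The point of the sketch is only the ordering: the video preparation stays ahead of FREEZE, and the signature removal is tied to the wakeup path rather than left as a separate manual step.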