Hi. On Thursday 27 September 2007 06:30:36 Joseph Fannin wrote: > On Fri, Sep 21, 2007 at 11:45:12AM +0200, Pavel Machek wrote: > > Hi! > > > > > > > > Sounds doable, as long as you can cope with long command lines (which > > > > shouldn't be a biggie). (If you've got a swapfile or parts of a swap > > > > partition already in use, it can be quite fragmented). > > > > > > Hmm. This is an interesting problem. Sharing a swap file or a swap > > > partition with the actual swap of user space pages does seem to be > > > a limitation of this approach. > > > > > > Although the fact that it is simple to write to a separate file may > > > be a reasonable compensation. > > > > I'm not sure how you'd write it to a separate file. Notice that kjump > > kernel may not mount journalling filesystems, not even > > read-only. (Ext3 replays journal in that case). You could pass block > > numbers from the original kernel... > > The ext3 thing is a bug, the case for which I don't think has been > adequately explained to the ext[34] folks. There should be at least a > no_replay mount flag available, or something. It has ramifications > for more than just hibernation. > > And yeah, I'm gonna bring up the swap files thing again. If you > can hibernate to a swap file, you can hibernate to a dedicated > hibernation file, and vice versa. > > If you can't hibernate to a swap file, then swap files are > effectively unsupported for any system you might want to hibernate. > <handwave> I wonder what embedded folks would think about that > </handwave>. > > But, in my ignorance, I'm not sure even fixing the ext3 bug will > guarantee you consistent metadata so that you can handle a > swap/hibernate file. You can do a sync(), but how do you make that > not race against running processes without the freezer, or blkdev > snapshots? > > I guess uswsusp and the-patch-previously-known-as-suspend2 handle > this somehow, though. > > (It's that same ignorance that has me waiting for someone with > established credit with kernel people to make that argument for the > ext3 bug, so I can hang my own reasons for thinking that it's bad off > of theirs). I haven't looked at swsusp support, but TuxOnIce handles all storage (swap partitions, swap files and ordinary files) by first allocating swap (if we're using swap), then bmapping the storage we're going to use. After that, we can freeze filesystems and processes with impunity. The allocated storage is then viewed as just a collection of bdevs, each with an ordered chain of extents defining which blocks we're going to read/write - a series of tapes if you like. In the image header, we store dev_ts and the block chains, together with the configuration information. As long as the same bdevs are configured at boot time prior to the echo > /sys/power/resume, we're in business. Filesystems don't need to be mounted because we don't use filesystem code anyway. (LVM etc does though in so far as it's needed to make the dev_t match the device again). This matches with what you said above about hibernating to swap files and dedicated hibernation files - TuxOnIce uses exactly the same code to do the i/o to both; the variation is in the code to recognise the image header and allocate/free/bmap storage. <not a filesystem expert> Personally, I don't think ext[34] is broken. If there's data being left in the journal that will need replaying, then mounting without replaying the journal sounds wrong. Perhaps you should instead be arguing that nothing should be left in the journal after a filesystem freeze. But, of course, current code isn't doing a filesystem freeze (just a process freeze) and the kexec guys want to take even that away. </not a filesystem expert> In short, I agree. AFAICS, you need both the process freezer and filesystem freezing to make this thing fly properly. Nigel -- See http://www.tuxonice.net for Howtos, FAQs, mailing lists, wiki and bugzilla info. _______________________________________________ linux-pm mailing list linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/linux-pm