Re: Proposal for a new algorithm for reading & writing a hibernation image.

Nigel Cunningham <ncunningham@xxxxxxxxxxx> · Tue, 11 May 2010 07:16:59 +1000

Hi Bill.

On 11/05/10 01:54, Bill Davidsen wrote:
> Nigel Cunningham wrote:
>> Hi all.
>>
>> Some discussions with Rafael a while ago (can't find the original
>> message now, sorry) got me thinking about whether there might be a
>> better way of writing a complete image of memory, particularly in the
>> context of KMS breaking existing TuxOnIce algorithms. I finally got
>> around to hammering out the algorithm last night, and thought I'd put
>> it out there for others to comment on, particularly since I'm no
>> expert on fault handling - it may be that what I'm thinking of is
>> impossible on the hardware we support.
>>
>> The algorithm I'm thinking of trying to implement goes as follows:
>>
>> When saving the image
>> =====================
>>
>> 1. Modify driver suspend and resume routines so that freeing the
>> memory used to store state is separated from the restoring of that
>> state in the resume methods. This will allow us to get the drivers to
>> save their state prior to writing the image, without needing the
>> memory allocated for this purpose to be atomically copied.
>> 2. Prior to writing any of the image, also set up new 4k page tables
>> such that an attempt to make a change to any of the pages we're about
>> to write to disk will result in a page fault, giving us an opportunity
>> to flag the page as needing an atomic copy later. Once this is done,
>> write protection for the page can be disabled and the write that
>> caused the fault allowed to proceed.
>> 3. Write the entire contents of memory to disk.
>> 4. Disable secondary CPUs (no need to do the driver suspend/resume
>> again) and atomically copy pages that faulted while writing the image.
>> 5. Write atomically copied data to disk, giving a complete image on
>> disk of memory at the time of the atomic copy.
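
To make steps 2 to 5 concrete, here is a toy userspace model in Python. It is only a sketch: the names (Memory, save_image, writes_during_io) are made up for illustration, and a "fault" is simulated by a set lookup rather than by real page table write protection and a fault handler, which is what the kernel would need.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Toy stand-in for RAM: a list of page contents plus protection state."""
    pages: list
    protected: set = field(default_factory=set)  # pages currently write protected
    dirty: set = field(default_factory=set)      # pages that faulted during the write

    def protect_all(self):
        # Step 2: write protect every page we are about to write to disk.
        self.protected = set(range(len(self.pages)))
        self.dirty.clear()

    def write(self, pfn, data):
        if pfn in self.protected:
            # Simulated page fault: flag the page as needing an atomic copy,
            # drop the protection, then let the write proceed.
            self.dirty.add(pfn)
            self.protected.discard(pfn)
        self.pages[pfn] = data


def save_image(mem, writes_during_io):
    """Return (disk, recopied).

    writes_during_io is a hypothetical hook: it maps a page index to the
    list of (pfn, data) writes that happen just before that page is written.
    """
    mem.protect_all()
    disk = {}
    for pfn in range(len(mem.pages)):             # step 3: write everything
        for target, data in writes_during_io.get(pfn, []):
            mem.write(target, data)
        disk[pfn] = mem.pages[pfn]
    # Step 4: nothing else is running now, so copy only the faulted pages.
    atomic_copy = {pfn: mem.pages[pfn] for pfn in mem.dirty}
    disk.update(atomic_copy)                      # step 5: write the copies
    return disk, sorted(atomic_copy)
```

In this model the cost of the atomic copy is the size of mem.dirty, which is the quantity that is only known once the first pass has finished.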
>>
>> When loading the image
>> ======================
>>
>> 1. Locate and allocate pages that can have data directly loaded (i.e.
>> are free now and used in the saved image). These will be loaded
>> without an 'atomic restore'.
>> 2. For other pages:
>>    As each page is loaded:
>>    - Write protect existing data.
>>    - If contents are the same as what is being loaded
>>        Discard loaded version
>>        If contents change after being write protected,
>>        1. make a copy of unmodified version to later atomically copy back.
>>        2. remove write protection
>>    - If contents differ
>>        1. set up atomic restore later
>>        2. remove write protection
>> 3. After loading memory and determining what needs to be atomically
>> restored:
>>    - Do drivers suspend, atomic restore as is done at the moment
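
The load side can be modelled the same way. Again this is a hedged Python sketch with invented names (load_image, write_fault, atomic_restore); it assumes page contents can simply be compared for equality and it treats a write to a still-protected page as the fault.

```python
def load_image(current, free, image):
    """Classify each page of the image as it is loaded.

    current: dict of pfn -> contents in the running (boot) kernel
    free:    set of pfns that are free now
    image:   dict of pfn -> contents saved in the image
    Returns (matched, deferred).
    """
    matched = set()   # contents already equal the image; left write protected
    deferred = {}     # pfn -> contents to put back during the atomic restore
    for pfn, saved in image.items():
        if pfn in free:
            current[pfn] = saved      # step 1: load directly, no atomic restore
        elif current[pfn] == saved:
            matched.add(pfn)          # step 2: discard the loaded version
        else:
            deferred[pfn] = saved     # step 2: contents differ, restore later


    return matched, deferred


def write_fault(current, matched, deferred, pfn, data):
    """Simulated write. A write to a matched page counts as the fault."""
    if pfn in matched:
        # Keep the unmodified version so it can be copied back atomically,
        # then remove the write protection.
        deferred[pfn] = current[pfn]
        matched.discard(pfn)
    current[pfn] = data


def atomic_restore(current, deferred):
    # Step 3: with drivers suspended, copy the deferred pages into place.
    current.update(deferred)
```

The point of the matched set is that pages which never fault cost no extra memory; only pages that differ, or that change after being protected, end up in deferred.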
>>
>> The main difficulties I see with the above, apart from not being
>> sure that I can achieve it with fault handling, are:
>>
>> 1. Memory requirements for the atomic copy wouldn't be known until the
>> point where we get to the atomic copy. I guess, though, that with most
>> things frozen, we'd expect the number to be reasonably consistent and
>> small.
>> 2. We also need extra memory for the driver suspend at resume time.
>> That said, since it's not otherwise needed, it could be the same
>> memory that's reserved for doing I/O and for atomically copied data
>> when writing the image.
>>
>> Are there other issues people can see that I might have missed?
>>
> I doubt you "missed" considering compression, but you didn't mention it.

Yeah. I was just focusing on the method of ensuring we get a consistent 
image. I'd be seeking in the first instance to modify the existing 
TuxOnIce code to work this way, so it would still have multithreaded 
I/O, compression and so on.

What I really want to do is work on patches to improve swsusp, but I 
have to keep the existing TuxOnIce users happy too :)

Regards,

Nigel
_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm