[SNIP]
>>>> BTW I really dislike that tt->restore is allocated dynamically. That is just another allocation which can cause problems. We should probably have all the state necessary for the operation in the TT object.

>>> Initially it was done this way. But that meant a pre-allocated struct page-pointer array of 1 << MAX_PAGE_ORDER size (2 MiB) for each ttm_tt. That led to a patch to reduce MAX_PAGE_ORDER to PMD size order, but as you might remember, that needed to be ripped out because the PMD size macros aren't constant across all architectures. IIRC it was ARM causing compilation failures, and Linus wasn't happy.

>> Yeah, I do remember that. But I don't fully get why you need this page-pointer array in the first place?

> So the TTM page-pointer array holds the backup handles when backed up. During recovery, we allocate a (potentially huge) page and populate the TTM page-pointer array with pointers into that. Meanwhile we need to keep the backup handles for the recovery phase in the restore structure, and in the middle of the recovery phase you might hit an -EINTR.
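[Editorial note: a minimal sketch of the scheme described in the quoted paragraph above. The names ttm_restore_ctx and backup_copy_page() are illustrative placeholders, not the actual TTM API; the point is only that the backup handles have to live in a separate restore structure so that an -EINTR in the middle of the copy loop can be resumed later.]

#include <linux/mm.h>

/* Placeholder prototype for illustration; not an existing TTM symbol. */
int backup_copy_page(unsigned long handle, struct page *dst);

/* Illustrative only: not the real struct ttm_pool_tt_restore layout. */
struct ttm_restore_ctx {
	unsigned long *handles;		/* backup handles kept for the restore phase */
	unsigned long nr_pages;
	unsigned long restored;		/* resume point after -EINTR */
};

static int restore_into_huge_page(struct page **pages,
				  struct ttm_restore_ctx *ctx,
				  struct page *huge)
{
	unsigned long i;

	for (i = ctx->restored; i < ctx->nr_pages; ++i) {
		int ret;

		/* Copy the backed-up data into the new (huge) allocation. */
		ret = backup_copy_page(ctx->handles[i], huge + i);
		if (ret)
			return ret;	/* e.g. -EINTR: remaining handles stay in ctx */

		/* The page-pointer array entry now points into the allocation. */
		pages[i] = huge + i;
		ctx->restored = i + 1;
	}

	return 0;
}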
I still don't see the problem to be honest.
What you basically do on recovery is the following:
1. Allocate a bunch of contiguous memory of order X.
2. Take the first entry from the page_array, convert that to your backup handle and copy the data back into the just allocated contiguous memory.
3. Replace the first entry in the page array with the struct page pointer of the allocated contiguous memory.
4. Take the next entry from the page_array, convert that to your backup handle and copy the data back into the just allocated contiguous memory.
5. Replace the next entry in the page_array with the struct page pointer + 1 of the allocated contiguous memory.
6. Repeat until the contiguous memory is fully restored, then jump back to 1 for the next chunk.
What exactly do you need this pre-allocated struct page-pointer array of 1 << MAX_PAGE_ORDER size for?
Sorry, I must really be missing something here.
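[Editorial note: a rough sketch of the alternative outlined in steps 1-6 above, under the assumption that a page-pointer array entry can be converted back into its backup handle in place. page_ptr_to_backup_handle() and backup_copy_page() are assumed helper names, not existing TTM functions; the point is that no separate pre-allocated handle array is needed, because each handle is consumed exactly when its slot in the array is overwritten.]

#include <linux/gfp.h>
#include <linux/minmax.h>
#include <linux/mm.h>

/* Placeholder prototypes for illustration; not existing TTM symbols. */
unsigned long page_ptr_to_backup_handle(struct page *p);
int backup_copy_page(unsigned long handle, struct page *dst);

static int restore_in_place(struct page **pages, unsigned long nr_pages,
			    unsigned int order)
{
	unsigned long i = 0;

	while (i < nr_pages) {
		unsigned long chunk = min(1UL << order, nr_pages - i);
		struct page *huge;
		unsigned long j;

		/* 1. Allocate a bunch of contiguous memory of order X. */
		huge = alloc_pages(GFP_KERNEL, order);
		if (!huge)
			return -ENOMEM;

		for (j = 0; j < chunk; ++j) {
			/*
			 * 2./4. Convert the array entry back to its backup
			 * handle and copy the data into the new allocation.
			 */
			unsigned long handle =
				page_ptr_to_backup_handle(pages[i + j]);
			int ret;

			ret = backup_copy_page(handle, huge + j);
			if (ret)
				return ret;

			/* 3./5. Replace the entry with struct page pointer + j. */
			pages[i + j] = huge + j;
		}

		/* 6. Jump back to 1 for the next chunk. */
		i += chunk;
	}

	return 0;
}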
Regards,
Christian.
> Thanks,
> Thomas