On Wed, 2024-12-04 at 12:24 +0100, Christian König wrote: > Am 04.12.24 um 12:09 schrieb Thomas Hellström: > > [SNIP] > > > > > > BTW I really dislike that tt->restore is allocated > > > > > dynamically. > > > > > That > > > > > is > > > > > just another allocation which can cause problems. > > > > > We should probably have all the state necessary for the > > > > > operation > > > > > in > > > > > the > > > > > TT object. > > > > Initially it was done this way. But that meant a pre-allocated > > > > struct > > > > page-pointer array the of 1 << MAX_PAGE_ORDER size (2MiB) for > > > > each > > > > ttm_tt. That lead to a patch to reduce the MAX_PAGE_ORDER to > > > > PMD > > > > size > > > > order, but as you might remember, that needed to be ripped out > > > > because > > > > the PMD size macros aren't constant across all architectures. > > > > IIRC > > > > it > > > > was ARM causing compilation failures, and Linus wasn't happy. > > > Yeah, I do remember that. But I don't fully get why you need this > > > page-pointer array in the first place? > > So the TTM page-pointer array holds the backup handles when backed > > up. > > During recovery, We allocate a (potentially huge) page and populate > > the > > TTM page-pointer array with pointers into that. Meanwhile we need > > to > > keep the backup handles for the recover phase in the restore > > structure, > > and in the middle of the recover phase you might hit an -EINTR. > > I still don't see the problem to be honest. > > What you basically do on recovery is the following: > 1. Allocate a bunch of contiguous memory of order X. > 2. Take the first entry from the page_array, convert that to your > backup > handle and copy the data back into the just allocated contiguous > memory. > 3. Replace the first entry in the page array with the struct page > pointer of the allocated contiguous memory. > 4. Take the next entry from the page_array, convert that to your > backup > handle and copy the data back into the just allocated contiguous > memory. > 5. Replace the next entry in the page_array with the struct page > pointer > + 1 of the allocated contiguous memory. > 6. Repeat until the contiguous memory is fully recovered and we jump > to > 1 again. > > What exactly do you need this pre-allocated struct page-pointer array > of > 1 << MAX_PAGE_ORDER for? > > Sorry, I must really be missing something here. It was like a year or more ago since I put this patch together, so TBH I can't recall the details, other than I'm pretty sure I tried that and decided against it. Could have been that the changes were too invasive, and it's pretty easy to break this code even without invasive changes... However with an accessor function for the old page pointers and one for the new ones I imagine it should be possible. I'll give it a try and see what can be done. /Thomas > > Regards, > Christian. > > > > > Thanks, > > Thomas