On 07/08/2019 15:56, Matthew Wilcox wrote:
> On Wed, Aug 07, 2019 at 03:30:38PM +0100, Steven Price wrote:
>> On 07/08/2019 15:15, Matthew Wilcox wrote:
>>> On Tue, Aug 06, 2019 at 11:40:00PM -0700, Christoph Hellwig wrote:
>>>> On Tue, Aug 06, 2019 at 12:09:38PM -0700, Matthew Wilcox wrote:
>>>>> Has anyone looked at turning the interface inside-out? ie something like:
>>>>>
>>>>> 	struct mm_walk_state state = { .mm = mm, .start = start, .end = end, };
>>>>>
>>>>> 	for_each_page_range(&state, page) {
>>>>> 		... do something with page ...
>>>>> 	}
>>>>>
>>>>> with appropriate macrology along the lines of:
>>>>>
>>>>> #define for_each_page_range(state, page) \
>>>>> 	while ((page = page_range_walk_next(state)))
>>>>>
>>>>> Then you don't need to package anything up into structs that are shared
>>>>> between the caller and the iterated function.
>>>>
>>>> I'm not an all that huge fan of super magic macro loops. But in this
>>>> case I don't see how it could even work, as we get special callbacks
>>>> for huge pages and holes, and people are trying to add a few more ops
>>>> as well.
>>>
>>> We could have bits in the mm_walk_state which indicate what things to return
>>> and what things to skip. We could (and probably should) also use different
>>> iterator names if people actually want to iterate different things. eg
>>> for_each_pte_range(&state, pte) as well as for_each_page_range().
>>>
>>
>> The iterator approach could be awkward for the likes of my generic
>> ptdump implementation[1]. It would require an iterator which returns all
>> levels and allows skipping levels when required (to prevent KASAN
>> slowing things down too much). So something like:
>>
>> start_walk_range(&state);
>> for_each_page_range(&state, page) {
>> 	switch(page->level) {
>> 	case PTE:
>> 		...
>> 	case PMD:
>> 		if (...)
>> 			skip_pmd(&state);
>> 		...
>> 	case HOLE:
>> 		....
>> 	...
>> 	}
>> }
>> end_walk_range(&state);
>>
>> It seems a little fragile - e.g. we wouldn't (easily) get type checking
>> that you are actually treating a PTE as a pte_t. The state mutators like
>> skip_pmd() also seem a bit clumsy.
>
> Once you're on-board with using a state structure, you can use it in all
> kinds of fun ways. For example:
>
> struct mm_walk_state {
> 	struct mm_struct *mm;
> 	unsigned long start;
> 	unsigned long end;
> 	unsigned long curr;
> 	p4d_t p4d;
> 	pud_t pud;
> 	pmd_t pmd;
> 	pte_t pte;
> 	enum page_entry_size size;
> 	int flags;
> };
>
> For this user, I'd expect something like ...
>
> DECLARE_MM_WALK_FLAGS(state, mm, start, end,
> 		MM_WALK_HOLES | MM_WALK_ALL_SIZES);
>
> walk_each_pte(state) {
> 	switch (state->size) {
> 	case PE_SIZE_PTE:
> 		...
> 	case PE_SIZE_PMD:
> 		if (...(state->pmd))
> 			continue;

You need to be able to signal whether you want to descend into the PMD or
skip the entire part of the tree. This was my skip_pmd() function above.

> 	...
> 	}
> }
>
> There's no need to have start / end walk function calls.
>

You've got a start walk function (it's your DECLARE_MM_WALK_FLAGS
above). The end walk I agree I think you don't actually need it since
struct mm_walk_state contains all the state.

Steve
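
To make the descend-or-skip question concrete, here is a minimal userspace
sketch of a state-structure iterator over a toy two-level table. Everything in
it is hypothetical: the toy_* names, the fixed TOY_PTRS_PER_PMD and the use of
plain ints as "PTEs" are assumptions for illustration, not anything from the
kernel or from the patches under discussion. It only shows one way a caller
could ask the iterator not to descend into the PMD it was just handed.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PTRS_PER_PMD 4	/* hypothetical, tiny for illustration */

enum toy_entry_size { TOY_SIZE_HOLE, TOY_SIZE_PMD, TOY_SIZE_PTE };

struct toy_pmd {
	const int *ptes;	/* NULL means a hole at this level */
};

struct toy_walk_state {
	const struct toy_pmd *pmds;
	size_t nr_pmds;
	size_t pmd_idx;		/* PMD currently being visited */
	size_t pte_idx;		/* next PTE to return within that PMD */
	bool in_pmd;		/* true while descending into pmd_idx */
	bool skip;		/* caller asked not to descend */
	enum toy_entry_size size;	/* kind of entry the last call returned */
	int pte;		/* valid when size == TOY_SIZE_PTE */
};

/* The state mutator: do not descend into the PMD just returned. */
static void toy_walk_skip(struct toy_walk_state *st)
{
	st->skip = true;
}

/* Advance to the next entry; returns false once the range is exhausted. */
static bool toy_walk_next(struct toy_walk_state *st)
{
	if (st->in_pmd) {
		if (!st->skip && st->pte_idx < TOY_PTRS_PER_PMD) {
			st->pte = st->pmds[st->pmd_idx].ptes[st->pte_idx++];
			st->size = TOY_SIZE_PTE;
			return true;
		}
		/* Finished (or skipped) this PMD: move on to the next one. */
		st->in_pmd = false;
		st->pmd_idx++;
	}
	st->skip = false;
	if (st->pmd_idx >= st->nr_pmds)
		return false;
	if (!st->pmds[st->pmd_idx].ptes) {
		st->size = TOY_SIZE_HOLE;
		st->pmd_idx++;
		return true;
	}
	st->size = TOY_SIZE_PMD;
	st->in_pmd = true;
	st->pte_idx = 0;
	return true;
}

#define toy_walk_each(st) while (toy_walk_next(st))

/* Example caller: count PTEs, skipping the subtree under PMD skip_idx. */
static size_t toy_count_ptes(const struct toy_pmd *pmds, size_t nr_pmds,
			     size_t skip_idx)
{
	struct toy_walk_state st = { .pmds = pmds, .nr_pmds = nr_pmds };
	size_t count = 0;

	toy_walk_each(&st) {
		switch (st.size) {
		case TOY_SIZE_PMD:
			if (st.pmd_idx == skip_idx)
				toy_walk_skip(&st);
			break;
		case TOY_SIZE_PTE:
			count++;
			break;
		case TOY_SIZE_HOLE:
			break;
		}
	}
	return count;
}
```

In this sketch a bare "continue" at the PMD case would descend by default, and
the explicit toy_walk_skip() call is what prunes the subtree. The opposite
default (skip unless asked to descend) would work just as well; the point is
only that the caller's choice has to be recorded in the state somewhere.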