Re: drm pull for v5.3-rc1

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Wed, 7 Aug 2019 07:56:01 -0700

On Wed, Aug 07, 2019 at 03:30:38PM +0100, Steven Price wrote:
> On 07/08/2019 15:15, Matthew Wilcox wrote:
> > On Tue, Aug 06, 2019 at 11:40:00PM -0700, Christoph Hellwig wrote:
> >> On Tue, Aug 06, 2019 at 12:09:38PM -0700, Matthew Wilcox wrote:
> >>> Has anyone looked at turning the interface inside-out?  ie something like:
> >>>
> >>> 	struct mm_walk_state state = { .mm = mm, .start = start, .end = end, };
> >>>
> >>> 	for_each_page_range(&state, page) {
> >>> 		... do something with page ...
> >>> 	}
> >>>
> >>> with appropriate macrology along the lines of:
> >>>
> >>> #define for_each_page_range(state, page)				\
> >>> 	while ((page = page_range_walk_next(state)))
> >>>
> >>> Then you don't need to package anything up into structs that are shared
> >>> between the caller and the iterated function.
> >>
> >> I'm not an all that huge fan of super magic macro loops.  But in this
> >> case I don't see how it could even work, as we get special callbacks
> >> for huge pages and holes, and people are trying to add a few more ops
> >> as well.
> > 
> > We could have bits in the mm_walk_state which indicate what things to return
> > and what things to skip.  We could (and probably should) also use different
> > iterator names if people actually want to iterate different things.  eg
> > for_each_pte_range(&state, pte) as well as for_each_page_range().
> > 
> 
> The iterator approach could be awkward for the likes of my generic
> ptdump implementation[1]. It would require an iterator which returns all
> levels and allows skipping levels when required (to prevent KASAN
> slowing things down too much). So something like:
> 
> start_walk_range(&state);
> for_each_page_range(&state, page) {
> 	switch(page->level) {
> 	case PTE:
> 		...
> 	case PMD:
> 		if (...)
> 			skip_pmd(&state);
> 		...
> 	case HOLE:
> 		....
> 	...
> 	}
> }
> end_walk_range(&state);
> 
> It seems a little fragile - e.g. we wouldn't (easily) get type checking
> that you are actually treating a PTE as a pte_t. The state mutators like
> skip_pmd() also seem a bit clumsy.

Once you're on board with using a state structure, you can use it in all
kinds of fun ways.  For example:

struct mm_walk_state {
	struct mm_struct *mm;
	unsigned long start;
	unsigned long end;
	unsigned long curr;
	p4d_t p4d;
	pud_t pud;
	pmd_t pmd;
	pte_t pte;
	enum page_entry_size size;
	int flags;
};

For this user, I'd expect something like ...

	DECLARE_MM_WALK_FLAGS(state, mm, start, end,
				MM_WALK_HOLES | MM_WALK_ALL_SIZES);

	walk_each_pte(state) {
		switch (state->size) {
		case PE_SIZE_PTE:
			...
		case PE_SIZE_PMD:
			if (...(state->pmd))
				continue;
		...
		}
	}

There's no need to have start / end walk function calls.
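
To make the shape concrete, here is a rough userspace sketch of the same
idea.  It is only an illustration under stated assumptions: every toy_*
name is made up, the "address space" is a flat array rather than real
page tables, and addresses are counted in pages.  The point is that the
flags in the state decide what the iterator returns, and that the loop
needs no setup or teardown call around it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real kernel definitions. */
enum toy_entry_size { TOY_SIZE_HOLE, TOY_SIZE_PTE, TOY_SIZE_PMD };

#define TOY_WALK_HOLES		0x1	/* also report unmapped ranges */
#define TOY_WALK_ALL_SIZES	0x2	/* also report PMD-sized entries */

struct toy_entry {
	enum toy_entry_size size;
	unsigned long npages;		/* pages covered by this entry */
};

struct toy_walk_state {
	const struct toy_entry *table;	/* the toy "address space" */
	size_t nr;			/* number of entries in table */
	size_t idx;			/* next entry to examine */
	unsigned long next_addr;	/* address of the next entry */
	unsigned long curr;		/* address of the entry just returned */
	enum toy_entry_size size;	/* size of the entry just returned */
	int flags;
};

/* Advance to the next entry the flags ask for; return 0 when done. */
static int toy_walk_next(struct toy_walk_state *s)
{
	while (s->idx < s->nr) {
		const struct toy_entry *e = &s->table[s->idx++];
		unsigned long addr = s->next_addr;

		s->next_addr += e->npages;
		if (e->size == TOY_SIZE_HOLE && !(s->flags & TOY_WALK_HOLES))
			continue;
		if (e->size == TOY_SIZE_PMD && !(s->flags & TOY_WALK_ALL_SIZES))
			continue;
		s->curr = addr;
		s->size = e->size;
		return 1;
	}
	return 0;
}

/* The loop macro is all the caller sees; no start/end calls. */
#define toy_walk_each(state)	while (toy_walk_next(state))

/* Example user: count the entries of one kind that the walk returns. */
static unsigned long toy_count(int flags, enum toy_entry_size want)
{
	static const struct toy_entry table[] = {
		{ TOY_SIZE_PTE, 1 },
		{ TOY_SIZE_HOLE, 3 },
		{ TOY_SIZE_PMD, 512 },
		{ TOY_SIZE_PTE, 1 },
	};
	struct toy_walk_state state = {
		.table = table,
		.nr = sizeof(table) / sizeof(table[0]),
		.flags = flags,
	};
	unsigned long n = 0;

	toy_walk_each(&state) {
		if (state.size != want)
			continue;	/* moves straight on to the next entry */
		n++;
	}
	return n;
}
```

Since all the state lives in the structure, a plain initialiser replaces
the start call, and the iterator returning 0 replaces the end call.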