Re: [PATCH 0/6] Boot-time switching between 4- and 5-level paging for 4.15, Part 1

"Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> · Tue, 31 Oct 2017 15:04:29 +0300

On Tue, Oct 31, 2017 at 10:47:27AM +0100, Ingo Molnar wrote:
> 
> * Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
> > I don't think this design is reasonable.
> > 
> >   - It introduces memory references where we haven't had them before.
> > 
> >     At this point all variable would fit a cache line, which is not that
> >     bad. But I don't see what would stop the list from growing in the
> >     future.
> 
> Is any of these actually in a hotpath?

Probably, no. Closest to hotpath I see so far is page_zone_id() in page
allocator.

> Also, note the context: your changes turn some of these into variables. Yes, I 
> suggest structuring them all and turning them all into variables, exactly because 
> the majority are now dynamic, yet their _naming_ suggests that they are constants.

Another way to put it would be that you suggest significant rework of kernel
machinery based on cosmetic nitpick. :)

> >   - We loose ability to optimize out change with static branches
> >     (cpu_feature_enabled() instead of pgtable_l5_enabled variable).
> > 
> >     It's probably, not that big of an issue here, but if we are going to
> >     use the same approach for other dynamic macros in the patchset, it
> >     might be.
> 
> Here too I think the (vast) majority of the uses here are for bootup/setup/init 
> purposes, where clarity and maintainability of code matters a lot.

I would argue that it makes maintainability worse.

It makes dependencies between values less obvious. For instance, checking
MAXMEM definition on x86-64 makes it obvious that it depends directly
on MAX_PHYSMEM_BITS.

If we would convert MAXMEM to variable, we would need to check where the
variable is initialized and make sure that nobody changes it afterwards.

Does it sound like a win for maintainability?

> >   - AFAICS, it requires changes to all architectures to provide such
> >     structures as we now partly in generic code.
> > 
> >     Or to introduce some kind of compatibility layer, but it would make
> >     the kernel as a whole uglier than cleaner. Especially, given that
> >     nobody beyond x86 need this.
> 
> Yes, all the uses should be harmonized (no compatibility layer) - but as you can 
> see it from the histogram I generated it's a few dozen uses, i.e. not too bad.

Without a compatibility layer, I would need to change every architecture.
It's few dozen patches easily. Not fun.

---------------------------------8<------------------------------------

Putting, my disagreement with the design aside, I try to prototype it.
And stumble an issue that I don't see how to solve.

If we are going to convert macros to variable whether they need to be
variable in the configuration we quickly put ourself into corner:

 - SECTIONS_SHIFT is dependent on MAX_PHYSMEM_BITS.

 - SECTIONS_SHIFT is used to define SECTIONS_WIDTH, but only if
   CONFIG_SPARSEMEM_VMEMMAP is not enabled. SECTIONS_WIDTH is zero
   otherwise.

At this point we can convert both SECTIONS_SHIFT and SECTIONS_WIDTH to
variables.

But SECTIONS_WIDTH used on preprocessor level to determinate NODES_WIDTH,
which used to determinate if we going to define NODE_NOT_IN_PAGE_FLAGS and
the value of LAST_CPUPID_WIDTH.

Making SECTIONS_WIDTH variable breaks the preprocessor logic. But problems
don't stop there:

  - LAST_CPUPID_WIDTH determinate if LAST_CPUPID_NOT_IN_PAGE_FLAGS is defined.

  - LAST_CPUPID_NOT_IN_PAGE_FLAGS is used define struct page and therefore
    cannot be dynamic (read variable).

In my patchset I made X86_5LEVEL select SPARSEMEM_VMEMMAP. It breaks the
chain and SECTIONS_WIDTH is never dynamic.

But how get it work with the design?

I can only think of hack like making machine.physmem.sections.shift a
constant macro if we don't want it dynamic for the configuration and leave
SECTHION_WITH as a constant in generic code.

To me it's ugly as hell.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>