Re: + mm-introduce-reported-pages.patch added to -mm tree

David Hildenbrand <david@xxxxxxxxxx> · Thu, 7 Nov 2019 00:33:23 +0100

On 06.11.19 18:48, Alexander Duyck wrote:
On Wed, 2019-11-06 at 17:54 +0100, Michal Hocko wrote:
On Wed 06-11-19 08:35:43, Alexander Duyck wrote:
On Wed, 2019-11-06 at 15:09 +0100, David Hildenbrand wrote:
Am 06.11.2019 um 13:16 schrieb Michal Hocko <mhocko@xxxxxxxxxx>:

I didn't have time to read through newer versions of this patch series
but I remember there were concerns about this functionality being pulled
into the page allocator previously both by me and Mel [1][2]. Have those been
addressed? I do not see an ack from Mel or any other MM people. Is there
really a consensus that we want something like that living in the
allocator?

I don‘t think there is. The discussion is still ongoing (although quiet,
Nitesh is working on a new version AFAIK). I think we should not rush
this.

How much time is needed to get a review? I waited 2 weeks since posting
v12 and the only comments I got on the code were from Andrew. Most of this
hasn't changed much since v10 and that was posted back in mid September. I
have been down to making small tweaks here and there and haven't had any
real critiques on the approach since Mel had the comments about conflicts
with compaction which I addressed by allowing compaction to punt the
reporter out so that it could split and splice the lists as it walked
through them.

Well, people are busy and MM community is not a large one. I cannot
really help you much other than keep poking those people and give
reasonable arguments so they decide to ack your patch.

I get that. But v10 was posted in mid September. Back then we had a
discussion about addressing what Mel had mentioned and I had mentioned
then that I had addressed it by allowing compaction to essentially reset
the reporter to get it out of the list so compaction could do this split
and splice tumbling logic.

I definitely do not intent to nack this work, I just have maintainability
concerns and considering there is an alternative approach that does not
require to touch page allocator internals and which we need to compare
against then I do not really think there is any need to push something
in right away. Or is there any pressing reason to have this merged right
now?

The alternative approach doesn't touch the page allocator, however it
still has essentially the same changes to __free_one_page. I suspect the

Nitesh is working on Michals suggestion to use page isolation instead 
AFAIK - which avoids this.

performance issue seen is mostly due to the fact that because it doesn't
touch the page allocator it is taking the zone lock and probing the page
for each set bit to see if the page is still free. As such the performance
regression seen gets worse the lower the order used for reporting.

Also I suspect Nitesh's patches are also in need of further review. I have
provided feedback however my focus ended up being on more the kernel
panics and 30% performance regression rather than debating architecture.

Please don't take this personally, but I really dislike you taking about 
Niteshs RFCs (!) and pushing for your approach (although it was you that 
was late to the party!) in that way. If there are problems then please 
collaborate and fix instead of using the same wrong arguments over and 
over again.

a) hotplug/sparse zones: I explained a couple of times why we can ignore 
that. There was never a reply from you, yet you keep coming up with 
that. I don't enjoy talking to a wall.

b) Locking optimizations: Come on, these are premature optimizations and 
nothing to dictate your design. *nobody* but you cares about that in an 
initial approach we get upstream. We can always optimize that.

c) Kernel panics: Come on, we are talking about complicated RFCs here 
with moving design decisions. We want do discuss *design* and 
*architecture* here, not *implementation details*.

d) Performance: We want to see a design that fits into the whole 
architecture cleanly, is maintainable, and provides a benefit. Of 
course, performance is relevant, but it certainly should not dictate our 
design of a *virtualization specific optimization feature*. Performance 
is not everything, otherwise please feel free and rewrite the kernel in 
ASM and claim it is better because it is faster.

Again, I do value your review and feedback, but I absolutely do not 
enjoy the way you are trying to push your series here, sorry.

Yes, if we end up finding out that there is real value in your approach, 
nothing speaks against considering it. But please don't try to hurry and 
push your series in that way. Please give everybody to time to evaluate.

--

Thanks,

David / dhildenb