Re: [PATCH RFC 0/8] mm: online/offline 4MB chunks controlled by device driver

On 13.04.2018 16:20, Michal Hocko wrote:
> On Fri 13-04-18 16:01:43, David Hildenbrand wrote:
>> On 13.04.2018 15:44, Michal Hocko wrote:
>>> [If you choose to not CC the same set of people on all patches - which
>>> is sometimes a legit thing to do - then please cc them to the cover
>>> letter at least.]
>>>
>>> On Fri 13-04-18 15:16:24, David Hildenbrand wrote:
>>>> I am right now working on a paravirtualized memory device ("virtio-mem").
>>>> These devices control a memory region and the amount of memory available
>>>> via it. Memory will not be indicated via ACPI and friends, the device
>>>> driver is responsible for it.
>>>
>>> How does this compare to other ballooning solutions? And why your driver
>>> cannot simply use the existing sections and maintain subsections on top?
>>>
>>
>> (further down in this mail is a small paragraph about that)
> 
> Sorry, I just stopped right there and didn't even finish reading to the
> end. Shame on me! I will do my homework and read it carefully (next week).
> 

Sure, in case you have any questions feel free to ask. And if you are
curious how this is used in practice, let me know and I can post the
current prototype that should run on x86 and s390x.

Have a nice weekend! :)

> [...]
>> "And why your driver cannot simply use the existing sections and
>> maintain subsections on top?"
>>
>> Can you elaborate how that is going to work? What I do as of now, is to
>> remember for each memory block (basically a section because I want to
>> make it as small as possible) which chunks ("subsections") are
>> online/offline. This works just fine. Is this what you are referring to?
> 
> Well, basically yes. I meant to suggest you simply mark pages reserved
> and pull them out. You can reuse some parts of such a struct page for
> your metadata because we should simply ignore those.

I store metadata in a separate structure (basically a uint64_t) right
now, because it is easier to track blocks, especially when I call
remove_memory() again.
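
A minimal sketch of what such per-block tracking could look like, for
illustration only. This is not the prototype's code: the struct, the
helper names, and the fixed 128MB block / 4MB chunk sizes are assumptions
made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes: 128MB memory block, 4MB chunks -> 32 chunks per block. */
#define BLOCK_SIZE_MB    128u
#define CHUNK_SIZE_MB    4u
#define CHUNKS_PER_BLOCK (BLOCK_SIZE_MB / CHUNK_SIZE_MB)

/* Hypothetical per-block state: bit i set means chunk i is online. */
struct block_state {
	uint64_t online_chunks;
};

static inline void chunk_set_online(struct block_state *b, unsigned int chunk)
{
	b->online_chunks |= UINT64_C(1) << chunk;
}

static inline void chunk_set_offline(struct block_state *b, unsigned int chunk)
{
	b->online_chunks &= ~(UINT64_C(1) << chunk);
}

static inline bool chunk_is_online(const struct block_state *b,
				   unsigned int chunk)
{
	return (b->online_chunks >> chunk) & 1u;
}

/* A block is a candidate for removal only once no chunk is online. */
static inline bool block_can_be_removed(const struct block_state *b)
{
	return b->online_chunks == 0;
}
```

With 32 chunks per block, a single 64-bit word has room to spare, which
is why one integer per block is enough in this sketch.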

The problem with reserved pages is that e.g. kdump will happily think it
can read all pages. So we need some way to indicate that to dumping tools.
Also, offline_pages() has to be taught not to simply offline a memory
section just because a subset of its pages has been offlined.
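
To illustrate the kind of check a dumping tool would need, here is a
hedged sketch. It assumes 4KB pages, 4MB chunks, and a per-block bitmap
of online chunks being made available to the tool somehow; the function
name and parameters are hypothetical and no existing kdump interface is
implied.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this example: 4KB pages and 4MB chunks. */
#define PAGE_SIZE_BYTES  4096ull
#define CHUNK_SIZE_BYTES (4ull << 20)
#define PAGES_PER_CHUNK  (CHUNK_SIZE_BYTES / PAGE_SIZE_BYTES)

/*
 * Hypothetical helper: returns true if pfn lies in an online chunk of the
 * block that starts at block_start_pfn. Bit i of online_chunks describes
 * chunk i; nr_chunks is the number of chunks in the block (at most 64).
 */
static bool pfn_is_dumpable(uint64_t block_start_pfn, uint64_t online_chunks,
			    unsigned int nr_chunks, uint64_t pfn)
{
	uint64_t chunk;

	if (pfn < block_start_pfn)
		return false;

	chunk = (pfn - block_start_pfn) / PAGES_PER_CHUNK;
	if (chunk >= nr_chunks || chunk >= 64)
		return false;

	return (online_chunks >> chunk) & 1u;
}
```

A tool using such a check would skip every page for which it returns
false instead of trying to read it.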

> 
> You still have to allocate memmap for the full section but 128MB
> sections have a nice effect that they fit into a single PMD for
> sparse-vmemmap. So you do not really need to touch mem sections, all you
> need is to keep your metadata on top.

Please keep in mind that we somehow have to get pages out of the system
when trying to remove 4MB chunks, especially so that remove_memory()
works once all chunks have been offlined. We cannot use any current
allocator for this ("allocate memory only in a certain address range").
So the online/offline_pages approach is the cleanest solution I have
found so far (e.g. offline_pages: isolate+migrate a 4MB block, flush the
pages out of all data structures, fix up accounting).

Also, please note that the subsection size can vary. It could e.g. be
8MB or 16MB; it is not fixed to 4MB. It could be configured
differently by the paravirtualized memory device (e.g. minimum
granularity is 8MB).
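
As a rough sketch of how a configurable chunk size changes the
bookkeeping: the helper below is hypothetical, and the power-of-two
requirement and the limit of 64 chunks per block (so the state fits in
one 64-bit word) are assumptions of this example, not constraints stated
by the device.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of chunks a memory block is split into for
 * a given chunk size, or 0 if the combination is not usable here.
 */
static unsigned int chunks_per_block(uint64_t block_size, uint64_t chunk_size)
{
	/* Assumed: chunk size must be a non-zero power of two. */
	if (chunk_size == 0 || (chunk_size & (chunk_size - 1)) != 0)
		return 0;

	/* The block must be covered by a whole number of chunks. */
	if (chunk_size > block_size || block_size % chunk_size != 0)
		return 0;

	/* Assumed: per-block state is a single 64-bit bitmap. */
	if (block_size / chunk_size > 64)
		return 0;

	return (unsigned int)(block_size / chunk_size);
}
```

For a 128MB block this gives 32 chunks at 4MB, 16 at 8MB, and 8 at 16MB.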

The current prototype allows a driver to:
- add/remove >= 4MB chunks to/from the system cleanly
- add memory blocks that it manages, when needed
- remove memory blocks when no longer needed (removing struct pages)
- teach kdump not to touch subsections that are offline

Thanks!

-- 

Thanks,

David / dhildenb



