On Thu 09-04-20 10:12:20, David Hildenbrand wrote:
> On 09.04.20 09:59, Michal Hocko wrote:
> > On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
> >> David Hildenbrand <david@xxxxxxxxxx> writes:
> >>
> >>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> >>> blocks as removable"), the user space interface to compute whether a memory
> >>> block can be offlined (exposed via
> >>> /sys/devices/system/memory/memoryX/removable) has effectively been
> >>> deprecated. We want to remove the leftovers of the kernel implementation.
> >>>
> >>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> >>> we'll start by:
> >>> 1. Testing if it contains any holes, and reject if so
> >>> 2. Testing if pages belong to different zones, and reject if so
> >>> 3. Isolating the page range, checking if it contains any unmovable pages
> >>>
> >>> Using is_mem_section_removable() before trying to offline is not only racy,
> >>> it can easily result in false positives/negatives. Let's stop manually
> >>> checking is_mem_section_removable(), and let device_offline() handle it
> >>> completely instead. We can remove the racy is_mem_section_removable()
> >>> implementation next.
> >>>
> >>> We now take more locks (e.g., memory hotplug lock when offlining and the
> >>> zone lock when isolating), but maybe we should optimize that
> >>> implementation instead if this ever becomes a real problem (after all,
> >>> memory unplug is already an expensive operation). We started using
> >>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> >>> Implement memory hotplug remove in the kernel"), with the initial
> >>> hotremove support of lmbs.
> >>
> >> It's also not very pretty in dmesg.
> >>
> >> Before:
> >>
> >> pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
> >> pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
> >> dlpar: Could not handle DLPAR request "memory add count 10"
> >
> > Yeah, there is more output but isn't that useful? Or put it differently
> > what is the actual problem from having those messages in the kernel log?
> >
> > From the below you can clearly tell that there are kernel allocations
> > which prevent hot remove from happening.
> >
> > If the overall size of the debugging output is a concern then we can
> > think of a way to reduce it. E.g. once you have a couple of pages
> > reported then all others from the same block are likely not interesting
> > much.
> >
>
> IIRC, we only report one page per block already. (and stop, as we
> detected something unmovable)

You are right.
-- 
Michal Hocko
SUSE Labs
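
For readers following the thread, the order of checks described in the quoted changelog, together with the "report one page per block and stop" behaviour mentioned above, can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct toy_page, enum toy_offline_result and toy_try_offline() are hypothetical names invented here, and none of this is the code in mm/memory_hotplug.c. Locking, page isolation and migration are not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a page; NOT a kernel structure. */
struct toy_page {
	bool present;	/* false models a hole in the range */
	int zone;	/* toy zone id */
	bool movable;	/* false models e.g. a kernel allocation */
};

/* Hypothetical result codes for the toy model. */
enum toy_offline_result {
	TOY_OFFLINE_OK = 0,
	TOY_OFFLINE_HOLE,
	TOY_OFFLINE_MULTI_ZONE,
	TOY_OFFLINE_UNMOVABLE,
};

/*
 * Run the three checks in the order given in the changelog and stop at
 * the first failure. On failure, *bad_idx (if non-NULL) receives the
 * index of the first offending page, so at most one page is reported
 * per block. On success, *bad_idx is left untouched.
 */
enum toy_offline_result toy_try_offline(const struct toy_page *pages,
					size_t npages, size_t *bad_idx)
{
	size_t i;

	assert(pages != NULL && npages > 0);

	/* 1. Reject the range if it contains any holes. */
	for (i = 0; i < npages; i++) {
		if (!pages[i].present) {
			if (bad_idx)
				*bad_idx = i;
			return TOY_OFFLINE_HOLE;
		}
	}

	/* 2. Reject the range if its pages belong to different zones. */
	for (i = 1; i < npages; i++) {
		if (pages[i].zone != pages[0].zone) {
			if (bad_idx)
				*bad_idx = i;
			return TOY_OFFLINE_MULTI_ZONE;
		}
	}

	/* 3. Look for unmovable pages; report only the first one. */
	for (i = 0; i < npages; i++) {
		if (!pages[i].movable) {
			if (bad_idx)
				*bad_idx = i;
			return TOY_OFFLINE_UNMOVABLE;
		}
	}

	return TOY_OFFLINE_OK;
}
```

In this model the caller does not ask a separate "is it removable?" question first. It simply attempts the operation and inspects the result, which mirrors the changelog's argument for letting device_offline() decide instead of pre-checking with is_mem_section_removable().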