Re: [PATCH 2/2] mm, memory_hotplug: remove timeout from __offline_memory

On 2017/9/4 16:21, Michal Hocko wrote:

> From: Michal Hocko <mhocko@xxxxxxxx>
> 
> We have a hardcoded 120s timeout after which the memory offline fails
> basically since the hot remove has been introduced. This is essentially
> a policy implemented in the kernel. Moreover there is no way to adjust
> the timeout and so we are sometimes facing memory offline failures if
> the system is under a heavy memory pressure or very intensive CPU
> workload on large machines.
> 
> It is not very clear what purpose the timeout actually serves. The
> offline operation is interruptible by a signal so if userspace wants

Hi Michal,

If the user knows what to do when the migration takes a long time,
this is OK, but I don't think all users know about this operation
(e.g. Ctrl+C) and its effect.

Thanks,
Xishi Qiu

> some timeout based termination this can be done trivially by sending a
> signal.
> 
> If there is a strong use case to do this from the kernel then we should
> do it properly and have it tunable from userspace with the timeout
> disabled by default along with the explanation who uses it and for what
> purpose.
> 
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> ---
>  mm/memory_hotplug.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c9dcbe6d2ac6..b8a85c11360e 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1593,9 +1593,9 @@ static void node_states_clear_node(int node, struct memory_notify *arg)
>  }
>  
>  static int __ref __offline_pages(unsigned long start_pfn,
> -		  unsigned long end_pfn, unsigned long timeout)
> +		  unsigned long end_pfn)
>  {
> -	unsigned long pfn, nr_pages, expire;
> +	unsigned long pfn, nr_pages;
>  	long offlined_pages;
>  	int ret, node;
>  	unsigned long flags;
> @@ -1633,12 +1633,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  		goto failed_removal;
>  
>  	pfn = start_pfn;
> -	expire = jiffies + timeout;
>  repeat:
>  	/* start memory hot removal */
> -	ret = -EBUSY;
> -	if (time_after(jiffies, expire))
> -		goto failed_removal;
>  	ret = -EINTR;
>  	if (signal_pending(current))
>  		goto failed_removal;
> @@ -1711,7 +1707,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  /* Must be protected by mem_hotplug_begin() or a device_lock */
>  int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  {
> -	return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
> +	return __offline_pages(start_pfn, start_pfn + nr_pages);
>  }
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>  


