Re: [PATCH 15/27] mm, page_alloc: Consider dirtyable memory in terms of nodes

On Wed 22-06-16 16:15:21, Michal Hocko wrote:
> On Tue 21-06-16 15:15:54, Mel Gorman wrote:
> > Historically dirty pages were spread among zones but now that LRUs are
> > per-node it is more appropriate to consider dirty pages in a node.
> 
> I think this deserves a note that the behavior on 32b highmem
> systems will change and could lead to early write throttling and
> observable stalls as a result, because highmem_dirtyable_memory will
> always return totalhigh_pages regardless of how much is free or on
> the LRUs, so we can overestimate it.
> 
> Highmem is usually used for LRU pages but there are other allocations
> which can use it (e.g. vmalloc). I understand how this is both an
> inherent problem of 32b with a larger high:low ratio and why it is hard
> to at least pretend we can cope with it with node based approach but we
> should at least document it.
> 
> A workaround would be to enable highmem_is_dirtyable, which can lead
> to premature OOM killer invocation for some workloads AFAIR.
[...]
> >  static unsigned long highmem_dirtyable_memory(unsigned long total)
> >  {
> >  #ifdef CONFIG_HIGHMEM
> > -	int node;
> >  	unsigned long x = 0;
> > -	int i;
> > -
> > -	for_each_node_state(node, N_HIGH_MEMORY) {
> > -		for (i = 0; i < MAX_NR_ZONES; i++) {
> > -			struct zone *z = &NODE_DATA(node)->node_zones[i];
> >  
> > -			if (is_highmem(z))
> > -				x += zone_dirtyable_memory(z);
> > -		}
> > -	}

Hmm, I have just noticed that we have NR_ZONE_LRU_ANON resp.
NR_ZONE_LRU_FILE so we can estimate the amount of highmem contribution
to the global counters by the following or similar:

	for_each_node_state(node, N_HIGH_MEMORY) {
		for (i = 0; i < MAX_NR_ZONES; i++) {
			struct zone *z = &NODE_DATA(node)->node_zones[i];

			if (!is_highmem(z))
				continue;

			x += zone_page_state(z, NR_FREE_PAGES) +
			     zone_page_state(z, NR_ZONE_LRU_FILE) -
			     high_wmark_pages(z);
		}
	}

Subtracting the high watermark would emulate the reserve. What do you think?
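
To make the arithmetic concrete, here is a rough userspace model of the
estimate, not kernel code. struct zone_model and both function names are
made up for illustration; the fields only stand in for is_highmem(),
NR_FREE_PAGES, NR_ZONE_LRU_FILE and high_wmark_pages(). Flooring the
per-zone result at zero is my assumption, so that a zone sitting below its
high watermark does not wrap the unsigned sum.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-zone counters; this is not struct zone. */
struct zone_model {
	bool highmem;             /* what is_highmem(z) would report */
	unsigned long free;       /* models NR_FREE_PAGES */
	unsigned long lru_file;   /* models NR_ZONE_LRU_FILE */
	unsigned long high_wmark; /* models high_wmark_pages(z) */
};

/*
 * Free plus file LRU pages of one zone, minus the high watermark.
 * Assumption: floor at zero instead of letting the subtraction wrap.
 */
static unsigned long zone_estimate(const struct zone_model *z)
{
	unsigned long pages = z->free + z->lru_file;

	return pages > z->high_wmark ? pages - z->high_wmark : 0;
}

/* Sum the estimate over highmem zones only; lowmem zones are skipped. */
static unsigned long highmem_estimate(const struct zone_model *zones, size_t nr)
{
	unsigned long x = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!zones[i].highmem)
			continue;
		x += zone_estimate(&zones[i]);
	}
	return x;
}
```

With one lowmem zone and two highmem zones, only the highmem ones
contribute, and a highmem zone whose free + file pages are below its high
watermark contributes nothing rather than a huge wrapped value.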
-- 
Michal Hocko
SUSE Labs
