Re: [patch 13/13] mm: memcontrol: rewrite uncharge API

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 15, 2014 at 04:23:50PM +0200, Michal Hocko wrote:
> On Tue 15-07-14 10:25:45, Michal Hocko wrote:
> [...]
> > diff --git a/Documentation/cgroups/memcg_test.txt b/Documentation/cgroups/memcg_test.txt
> > index bcf750d3cecd..8870b0212150 100644
> > --- a/Documentation/cgroups/memcg_test.txt
> > +++ b/Documentation/cgroups/memcg_test.txt
> [...]
> >  6. Shmem(tmpfs) Page Cache
> > -	Memcg's charge/uncharge have special handlers of shmem. The best way
> > -	to understand shmem's page state transition is to read mm/shmem.c.
> > +	The best way to understand shmem's page state transition is to read
> > +	mm/shmem.c.
> 
> :D
> 
> [...]
> >  7. Page Migration
> > -   	One of the most complicated functions is page-migration-handler.
> > -	Memcg has 2 routines. Assume that we are migrating a page's contents
> > -	from OLDPAGE to NEWPAGE.
> > -
> > -	Usual migration logic is..
> > -	(a) remove the page from LRU.
> > -	(b) allocate NEWPAGE (migration target)
> > -	(c) lock by lock_page().
> > -	(d) unmap all mappings.
> > -	(e-1) If necessary, replace entry in radix-tree.
> > -	(e-2) move contents of a page.
> > -	(f) map all mappings again.
> > -	(g) pushback the page to LRU.
> > -	(-) OLDPAGE will be freed.
> > -
> > -	Before (g), memcg should complete all necessary charge/uncharge to
> > -	NEWPAGE/OLDPAGE.
> > -
> > -	The point is....
> > -	- If OLDPAGE is anonymous, all charges will be dropped at (d) because
> > -          try_to_unmap() drops all mapcount and the page will not be
> > -	  SwapCache.
> > -
> > -	- If OLDPAGE is SwapCache, charges will be kept at (g) because
> > -	  __delete_from_swap_cache() isn't called at (e-1)
> > -
> > -	- If OLDPAGE is page-cache, charges will be kept at (g) because
> > -	  remove_from_swap_cache() isn't called at (e-1)
> > -
> > -	memcg provides following hooks.
> > -
> > -	- mem_cgroup_prepare_migration(OLDPAGE)
> > -	  Called after (b) to account a charge (usage += PAGE_SIZE) against
> > -	  memcg which OLDPAGE belongs to.
> > -
> > -        - mem_cgroup_end_migration(OLDPAGE, NEWPAGE)
> > -	  Called after (f) before (g).
> > -	  If OLDPAGE is used, commit OLDPAGE again. If OLDPAGE is already
> > -	  charged, a charge by prepare_migration() is automatically canceled.
> > -	  If NEWPAGE is used, commit NEWPAGE and uncharge OLDPAGE.
> > -
> > -	  But zap_pte() (by exit or munmap) can be called while migration,
> > -	  we have to check if OLDPAGE/NEWPAGE is a valid page after commit().
> > +
> > +	mem_cgroup_migrate()
> 
> This doesn't tell us anything abouta the page migration. On the other
> hand I am not entirely sure the documentation here is very much helpful.
> There is some outdated information. I wouldn't be opposed to remove
> everything up to "9. Typical Tests." section which should be the primary
> target of the file anyway.

Yeah, documentation of the implementation should be directly in the
source code and this file is kind of pointless.  So all I did there
was remove things that were wrong after my changes.  But I agree it
can probably be removed completely.

> > @@ -382,9 +382,13 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup *mem)
> >  }
> >  #endif
> >  #ifdef CONFIG_MEMCG_SWAP
> > -extern void mem_cgroup_uncharge_swap(swp_entry_t ent);
> > +extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
> > +extern void mem_cgroup_uncharge_swap(swp_entry_t entry);
> 
> Wouldn't it be nicer to have those two with symmetric names?
> mem_cgroup_{un}charge_swap?

I thought about that when I wrote them, but their operation is not
actually symmetrical.  The first one migrates a memsw charge from a
page to a swap entry when the page gets reclaimed - rather than when
the swap entry is allocated, the second one uncharges the swap entry
once the swap entry is released.

> > @@ -2760,15 +2752,15 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
> >  		spin_unlock_irq(&zone->lru_lock);
> >  	}
> >  
> > -	mem_cgroup_charge_statistics(memcg, page, anon, nr_pages);
> > -	unlock_page_cgroup(pc);
> > -
> > +	local_irq_disable();
> > +	mem_cgroup_charge_statistics(memcg, page, nr_pages);
> >  	/*
> >  	 * "charge_statistics" updated event counter. Then, check it.
> >  	 * Insert ancestor (and ancestor's ancestors), to softlimit RB-tree.
> >  	 * if they exceeds softlimit.
> >  	 */
> >  	memcg_check_events(memcg, page);
> > +	local_irq_enable();
> 
> preempt_{enable,disbale} should be sufficient for
> mem_cgroup_charge_statistics and memcg_check_events no?
> The first one is about per-cpu accounting (and that should be atomic
> wrt. IRQ on the same CPU) and the later one uses IRQ safe locks down in
> mem_cgroup_update_tree.

How could it be atomic wrt. IRQ on the local CPU when IRQs that modify
the counters can fire on the local CPU?

> > @@ -780,11 +780,14 @@ static int move_to_new_page(struct page *newpage, struct page *page,
> >  		rc = fallback_migrate_page(mapping, newpage, page, mode);
> >  
> >  	if (rc != MIGRATEPAGE_SUCCESS) {
> > -		newpage->mapping = NULL;
> > +		if (!PageAnon(newpage))
> > +			newpage->mapping = NULL;
> 
> OK, I am probably washed out from looking into this for too long but I
> cannot figure why have you done this...

mem_cgroup_uncharge() relies on PageAnon() working.  Usually, anon
pages retain their page->mapping until they hit the page allocator,
the exception was old migration pages.

> >  	} else {
> > +		mem_cgroup_migrate(page, newpage, false);
> >  		if (remap_swapcache)
> >  			remove_migration_ptes(page, newpage);
> > -		page->mapping = NULL;
> > +		if (!PageAnon(page))
> > +			page->mapping = NULL;
> >  	}
> >  
> >  	unlock_page(newpage);
> 
> [...]
> 
> The semantic is much cleaner now. I have to digest details about the
> patch because it is really huge. But nothing really jumped at me during
> the review (except for few minor things mentioned here and one mentioned
> in other email regarding USED flag).
> 
> Good work! 

Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]