Re: [PATCH] mm: add replace_page_cache_page() function

Miklos Szeredi <miklos@xxxxxxxxxx> · Tue, 11 Jan 2011 10:38:16 +0100

On Tue, 11 Jan 2011, Daisuke Nishimura wrote:
> > What I recommend is below. (Please see the newest -mm because of a bug fix for
> > mem cgroup) Considering page management on radix-tree, it can be considerd as
> > a kind of page-migration, which replaces pages on radix-tree.
> > 
> > ==
> > 
> > > +int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
> > > +{
> > > +	int error;
> > > +
> > > +	VM_BUG_ON(!PageLocked(old));
> > > +	VM_BUG_ON(!PageLocked(new));
> > > +	VM_BUG_ON(new->mapping);
> > > +
> > 	struct mem_cgroup *memcg;
> > 
> I think it should be initialized to NULL.
> 
> > 	error = mem_cgroup_prepare_migration(old, new, &memcg);
> 
> I want some comments like:
> 
> 	/*
> 	 * This is not page migration, but prepare_migration and end_migration
> 	 * does enough work for charge replacement.
> 	 */
> 
> > 	#
> > 	# This function will charge against "newpage". But this expects
> > 	# the caller allows GFP_KERNEL gfp_mask. 
> > 	# After this, the newpage is in "charged" state.
> > 	if (error)
> > 		return -ENOMEM;
> > 
> > > +	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> > > +	if (!error) {
> > > +		struct address_space *mapping = old->mapping;
> > > +		pgoff_t offset = old->index;
> > > +
> > > +		page_cache_get(new);
> > > +		new->mapping = mapping;
> > > +		new->index = offset;
> > > +
> > > +		spin_lock_irq(&mapping->tree_lock);
> > > +		__remove_from_page_cache(old);
> > > +		error = radix_tree_insert(&mapping->page_tree, offset, new);
> > > +		BUG_ON(error);
> > > +		mapping->nrpages++;
> > > +		__inc_zone_page_state(new, NR_FILE_PAGES);
> > > +		if (PageSwapBacked(new))
> > > +			__inc_zone_page_state(new, NR_SHMEM);
> > > +		spin_unlock_irq(&mapping->tree_lock);
> > > +		radix_tree_preload_end();
> > 
> > > +		mem_cgroup_replace_cache_page(old, new); <== remove this.
> > 
> > 		mem_cgroup_end_migraton(memcg, old, new, true);
> > 
> > > +		page_cache_release(old);
> > > +	} 
> > 	else 
> > 		mem_cgroup_end_migration(memcg, old, new, false);
> > 
> > 	# Here, if the 4th argument is true, old page is uncharged.
> > 	# if the 4th argument is false, the new page is uncharged.
> > 	# Then, "charge" of the old page will be migrated onto the new page
> > 	# if replacement is done.
> > 
> > 
> > 
> > > +
> > > +	return error;
> > > +}
> > > +EXPORT_SYMBOL_GPL(replace_page_cache_page);
> > > +
> > 
> > ==
> > 
> > I think this is enough simple and this covers all memory cgroup's racy
> > problems.
> > 
> I agree.

Thanks for the comments.

Yeah, using existing infrastructure is undoubtedly simpler and less
prone to bugs.  So going with this for a first implementation might
do.

However, replace_page_cache_page() is meant to be very efficient,
otherwise any performance won by not copying the page contents are
lost to the cost of page replacement.  My guess is,
mem_cgroup_prepare_migration()/end_migration() are to heavyweight for
this.

Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html