On Tue, 11 Jan 2011, Daisuke Nishimura wrote:
> > What I recommend is below. (Please see the newest -mm because of a bug fix for
> > mem cgroup) Considering page management on radix-tree, it can be considered as
> > a kind of page-migration, which replaces pages on radix-tree.
> >
> > ==
> >
> > > +int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
> > > +{
> > > +	int error;
> > > +
> > > +	VM_BUG_ON(!PageLocked(old));
> > > +	VM_BUG_ON(!PageLocked(new));
> > > +	VM_BUG_ON(new->mapping);
> > > +
> > 	struct mem_cgroup *memcg;
> >
> I think it should be initialized to NULL.
>
> > 	error = mem_cgroup_prepare_migration(old, new, &memcg);
>
> I want some comments like:
>
> 	/*
> 	 * This is not page migration, but prepare_migration and end_migration
> 	 * do enough work for charge replacement.
> 	 */
>
> > #
> > # This function will charge against "newpage". But this expects
> > # the caller allows GFP_KERNEL gfp_mask.
> > # After this, the newpage is in "charged" state.
> > 	if (error)
> > 		return -ENOMEM;
> >
> > > +	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> > > +	if (!error) {
> > > +		struct address_space *mapping = old->mapping;
> > > +		pgoff_t offset = old->index;
> > > +
> > > +		page_cache_get(new);
> > > +		new->mapping = mapping;
> > > +		new->index = offset;
> > > +
> > > +		spin_lock_irq(&mapping->tree_lock);
> > > +		__remove_from_page_cache(old);
> > > +		error = radix_tree_insert(&mapping->page_tree, offset, new);
> > > +		BUG_ON(error);
> > > +		mapping->nrpages++;
> > > +		__inc_zone_page_state(new, NR_FILE_PAGES);
> > > +		if (PageSwapBacked(new))
> > > +			__inc_zone_page_state(new, NR_SHMEM);
> > > +		spin_unlock_irq(&mapping->tree_lock);
> > > +		radix_tree_preload_end();
> >
> > > +		mem_cgroup_replace_cache_page(old, new); <== remove this.
> >
> > 		mem_cgroup_end_migration(memcg, old, new, true);
> >
> > > +		page_cache_release(old);
> > > +	}
> > 	else
> > 		mem_cgroup_end_migration(memcg, old, new, false);
> >
> > # Here, if the 4th argument is true, old page is uncharged.
> > # if the 4th argument is false, the new page is uncharged.
> > # Then, "charge" of the old page will be migrated onto the new page
> > # if replacement is done.
> >
> >
> > > +
> > > +	return error;
> > > +}
> > > +EXPORT_SYMBOL_GPL(replace_page_cache_page);
> > > +
> >
> > ==
> >
> > I think this is simple enough and this covers all memory cgroup's racy
> > problems.
> >
> I agree.

Thanks for the comments.

Yeah, using existing infrastructure is undoubtedly simpler and less
prone to bugs.  So going with this for a first implementation might
do.

However, replace_page_cache_page() is meant to be very efficient;
otherwise any performance won by not copying the page contents is
lost to the cost of page replacement.  My guess is that
mem_cgroup_prepare_migration()/end_migration() are too heavyweight
for this.

Thanks,
Miklos
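
The charge handoff described in the quoted "#" notes can be modelled
in user space.  The sketch below is only an illustration under stated
assumptions: every model_* name is hypothetical, the structs are
stand-ins rather than the kernel's struct page or struct mem_cgroup,
and locking, the radix tree and the real memcg accounting are left out
entirely.  It shows the ordering only: the new page is charged up
front, then exactly one of the two pages is uncharged depending on
whether the replacement succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not kernel types. */
struct model_memcg {
	long usage;	/* pages currently charged to the group */
	long limit;	/* maximum pages the group may charge */
};

struct model_page {
	bool charged;
};

/* Models the "prepare" step: charge the new page before touching the cache. */
int model_prepare_migration(struct model_memcg *cg, struct model_page *newpage)
{
	if (cg->usage >= cg->limit)
		return -ENOMEM;
	cg->usage++;
	newpage->charged = true;
	return 0;
}

/*
 * Models the "end" step: on success the old page is uncharged,
 * on failure the new page is uncharged.
 */
void model_end_migration(struct model_memcg *cg, struct model_page *oldpage,
			 struct model_page *newpage, bool success)
{
	struct model_page *drop = success ? oldpage : newpage;

	/* The prepare step must already have charged the new page. */
	assert(newpage->charged);

	if (drop->charged) {
		drop->charged = false;
		cg->usage--;
	}
}

/*
 * Models the overall flow; insert_ok stands in for the outcome of the
 * (omitted) preload and radix-tree replacement.
 */
int model_replace_page(struct model_memcg *cg, struct model_page *oldpage,
		       struct model_page *newpage, bool insert_ok)
{
	int error = model_prepare_migration(cg, newpage);

	if (error)
		return error;

	model_end_migration(cg, oldpage, newpage, insert_ok);
	return insert_ok ? 0 : -ENOMEM;
}
```

In this model both pages are charged between the two steps, so the
group needs room for one extra page while the replacement is in
flight, and each replacement costs one charge plus one uncharge.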