On Thu, 8 Sep 2011 16:52:22 -0700
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 18 Jan 2011 15:28:44 -0800
> Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Tue, 18 Jan 2011 12:18:11 +0100
> > Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >
> > > +int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
> > > +{
> > > +	int error;
> > > +	struct mem_cgroup *memcg = NULL;
> >
> > I'm suspecting that the unneeded initialisation was added to suppress a
> > warning?
> >
> > I removed it, and didn't get a warning.  I expected to.
> >
> > Really, uninitialized_var() is better.  It avoids adding extra code
> > and, unlike "= 0" it is self-documenting.
> >
> > > +	VM_BUG_ON(!PageLocked(old));
> > > +	VM_BUG_ON(!PageLocked(new));
> > > +	VM_BUG_ON(new->mapping);
> > > +
> > > +	/*
> > > +	 * This is not page migration, but prepare_migration and
> > > +	 * end_migration does enough work for charge replacement.
> > > +	 *
> > > +	 * In the longer term we probably want a specialized function
> > > +	 * for moving the charge from old to new in a more efficient
> > > +	 * manner.
> > > +	 */
> > > +	error = mem_cgroup_prepare_migration(old, new, &memcg, gfp_mask);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> > > +	if (!error) {
> > > +		struct address_space *mapping = old->mapping;
> > > +		pgoff_t offset = old->index;
> > > +
> > > +		page_cache_get(new);
> > > +		new->mapping = mapping;
> > > +		new->index = offset;
> > > +
> > > +		spin_lock_irq(&mapping->tree_lock);
> > > +		__remove_from_page_cache(old);
> > > +		error = radix_tree_insert(&mapping->page_tree, offset, new);
> > > +		BUG_ON(error);
> > > +		mapping->nrpages++;
> > > +		__inc_zone_page_state(new, NR_FILE_PAGES);
> > > +		if (PageSwapBacked(new))
> > > +			__inc_zone_page_state(new, NR_SHMEM);
> > > +		spin_unlock_irq(&mapping->tree_lock);
> > > +		radix_tree_preload_end();
> > > +		page_cache_release(old);
> > > +		mem_cgroup_end_migration(memcg, old, new, true);
> >
> > This is all pretty ugly and inefficient.
> >
> > We call __remove_from_page_cache() which does a radix-tree lookup and
> > then fiddles a bunch of accounting things.
> >
> > Then we immediately do the same radix-tree lookup and then undo the
> > accounting changes which we just did.  And we do it in an open-coded
> > fashion, thus giving the kernel yet another code site where various
> > operations need to be kept in sync.
> >
> > Would it not be better to do a single radix_tree_lookup_slot(),
> > overwrite the pointer therein and just leave all the ancilliary
> > accounting unaltered?
> >
>
> Poke?

Sorry, I hadn't read this mail.

About the code around __remove_from_page_cache() and radix_tree_insert(),
I agree with you.

About the counters: the new page may be in a different zone, so the
related statistics still have to be changed.

About memcg: this function does page replacement, so the information in
the old page_cgroup has to be moved to the new page_cgroup. That is why I
advised using the migration code, which is already used in many
situations, rather than adding something new and strange.

Hmm, thinking about it quickly, could we reuse the core of the migration
function rather than using this new one? Hmm, but the page_count() check
may fail...

Thanks,
-Kame
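
For illustration only, here is a minimal userspace sketch of the two points discussed in this thread: replacing the page with a single slot lookup and an in-place pointer overwrite, while still moving the per-zone counter when the old and new pages sit in different zones. It is not kernel code. Every toy_* name is hypothetical, a fixed array stands in for the radix tree, and toy_zone_file_pages[] stands in for the NR_FILE_PAGES zone statistic. Locking, page reference counts and the memcg charge move are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; all names here are hypothetical. */
enum { TOY_NR_ZONES = 2, TOY_NR_SLOTS = 8 };

struct toy_mapping;

struct toy_page {
	int zone;                    /* zone the page belongs to */
	struct toy_mapping *mapping; /* owning mapping, NULL if not cached */
	size_t index;                /* offset within the mapping */
};

struct toy_mapping {
	struct toy_page *slots[TOY_NR_SLOTS]; /* stands in for the radix tree */
	unsigned long nrpages;
};

/* Stands in for the per-zone NR_FILE_PAGES statistic. */
static long toy_zone_file_pages[TOY_NR_ZONES];

/* One lookup that returns the slot itself rather than the page in it. */
static struct toy_page **toy_lookup_slot(struct toy_mapping *m, size_t index)
{
	return index < TOY_NR_SLOTS ? &m->slots[index] : NULL;
}

/* Add a page to an empty slot and account for it (used for setup). */
static void toy_add_page(struct toy_mapping *m, struct toy_page *page,
			 size_t index)
{
	struct toy_page **slot = toy_lookup_slot(m, index);

	assert(slot != NULL && *slot == NULL);
	*slot = page;
	page->mapping = m;
	page->index = index;
	m->nrpages++;
	toy_zone_file_pages[page->zone]++;
}

/* Replace old_page with new_page using a single slot lookup. */
static void toy_replace_page(struct toy_page *old_page,
			     struct toy_page *new_page)
{
	struct toy_mapping *m = old_page->mapping;
	struct toy_page **slot;

	assert(m != NULL);
	assert(new_page->mapping == NULL);

	slot = toy_lookup_slot(m, old_page->index);
	assert(slot != NULL && *slot == old_page);

	/* Overwrite the pointer in place: no remove followed by insert. */
	*slot = new_page;
	new_page->mapping = m;
	new_page->index = old_page->index;
	old_page->mapping = NULL;

	/*
	 * nrpages is untouched because the mapping still holds one page at
	 * this index. The zone counter only moves if the zones differ.
	 */
	if (old_page->zone != new_page->zone) {
		toy_zone_file_pages[old_page->zone]--;
		toy_zone_file_pages[new_page->zone]++;
	}
}
```

In this model the mapping-wide count never changes during a replacement, and the zone statistics are touched only in the cross-zone case. Whether the real function can be structured this way depends on the memcg and page_count() concerns raised above, which the sketch does not address.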