Re: mm: memcontrol: rewrite uncharge API

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 07-08-14 12:25:07, Johannes Weiner wrote:
> On Thu, Aug 07, 2014 at 09:38:26AM +0200, Michal Hocko wrote:
> > On Wed 06-08-14 14:02:35, Andrew Morton wrote:
> > > On Wed, 6 Aug 2014 14:00:55 -0700 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > 
> > > > From: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > > Subject: mm: memcontrol: rewrite uncharge API
> > > > 
> > > 
> > > Nope, sorry, that was missing
> > > mm-memcontrol-rewrite-uncharge-api-fix-clear-page-mapping-in-migration.patch.
> > > 
> > > This time:
> > > 
> > > From: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > Subject: mm: memcontrol: rewrite uncharge API
> > > 
> > > The memcg uncharging code that is involved towards the end of a page's
> > > lifetime - truncation, reclaim, swapout, migration - is impressively
> > > complicated and fragile.
> > > 
> > > Because anonymous and file pages were always charged before they had their
> > > page->mapping established, uncharges had to happen when the page type
> > > could still be known from the context; as in unmap for anonymous, page
> > > cache removal for file and shmem pages, and swap cache truncation for swap
> > > pages.  However, these operations happen well before the page is actually
> > > freed, and so a lot of synchronization is necessary:
> > > 
> > > - Charging, uncharging, page migration, and charge migration all need
> > >   to take a per-page bit spinlock as they could race with uncharging.
> > > 
> > > - Swap cache truncation happens during both swap-in and swap-out, and
> > >   possibly repeatedly before the page is actually freed.  This means
> > >   that the memcg swapout code is called from many contexts that make
> > >   no sense and it has to figure out the direction from page state to
> > >   make sure memory and memory+swap are always correctly charged.
> > > 
> > > - On page migration, the old page might be unmapped but then reused,
> > >   so memcg code has to prevent untimely uncharging in that case.
> > >   Because this code - which should be a simple charge transfer - is so
> > >   special-cased, it is not reusable for replace_page_cache().
> > > 
> > > But now that charged pages always have a page->mapping, introduce
> > > mem_cgroup_uncharge(), which is called after the final put_page(), when we
> > > know for sure that nobody is looking at the page anymore.
> > > 
> > > For page migration, introduce mem_cgroup_migrate(), which is called after
> > > the migration is successful and the new page is fully rmapped.  Because
> > > the old page is no longer uncharged after migration, prevent double
> > > charges by decoupling the page's memcg association (PCG_USED and
> > > pc->mem_cgroup) from the page holding an actual charge.  The new bits
> > > PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
> > > to the new page during migration.
> > > 
> > > mem_cgroup_migrate() is suitable for replace_page_cache() as well, which
> > > gets rid of mem_cgroup_replace_page_cache().
> > > 
> > > Swap accounting is massively simplified: because the page is no longer
> > > uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
> > > transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
> > > before the final put_page() in page reclaim.
> > > 
> > > Finally, page_cgroup changes are now protected by whatever protection the
> > > page itself offers: anonymous pages are charged under the page table lock,
> > > whereas page cache insertions, swapin, and migration hold the page lock. 
> > > Uncharging happens under full exclusion with no outstanding references. 
> > > Charging and uncharging also ensure that the page is off-LRU, which
> > > serializes against charge migration.  Remove the very costly page_cgroup
> > > lock and set pc->flags non-atomically.
> > 
> > I see some point in squashing all the fixups into the single patch but I
> > am afraid we have lost some interesting details from fix ups this time.
> > I think that at least
> > mm-memcontrol-rewrite-uncharge-api-fix-page-cache-migration.patch and
> > mm-memcontrol-rewrite-uncharge-api-fix-page-cache-migration-2.patch
> > would be good to go on their own _or_ their changelogs added here. The
> > whole page cache replace path is obscure and we should rather have that
> > documented so we do not have to google for details or go through painful
> > code inspection next time.
> 
> I agree, we would lose something there.  There is a paragraph in the
> changelog that says:
> 
> mem_cgroup_migrate() is suitable for replace_page_cache() as well,
> which gets rid of mem_cgroup_replace_page_cache().
> 
> Could you please update it to say:
> 
> mem_cgroup_migrate() is suitable for replace_page_cache() as well,
> which gets rid of mem_cgroup_replace_page_cache().  However, care
> needs to be taken because both the source and the target page can
> already be charged and on the LRU when fuse is splicing: grab the page
> lock on the charge moving side to prevent changing pc->mem_cgroup of a
> page under migration.  Also, the lruvecs of both pages change as we
> uncharge the old and charge the new during migration, and putback may
> race with us, so grab the lru lock and isolate the pages iff on LRU to
> prevent races and ensure the pages are on the right lruvec afterward.

Thanks! This is much better.

> > > [vdavydov@xxxxxxxxxxxxx: fix flags definition]
> > > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > > Cc: Tejun Heo <tj@xxxxxxxxxx>
> > > Cc: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
> > > Tested-by: Jet Chen <jet.chen@xxxxxxxxx>
> > > Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
> > 
> > If this comes from the above change then is should be probably removed.
> > You can replace it by my Acked-by. I have acked all the follow up fixes
> > but forgot to ack the initial patch.
> 
> Thanks!
> 
> > > Tested-by: Felipe Balbi <balbi@xxxxxx>
> > 
> > this tested-by came from the same preempt_{en,dis}able patch AFAICS.
> 
> Yeah it might be a bit overreaching to apply this to the full change.
> 
> On a different note, Michal, I just scrolled through the 2000 lines
> that follow to see if you had any more comments, but there was only
> your signature at the bottom.  Please think about the quote context
> after you inserted your inline comments and then trim accordingly.

Sure I usually trim emails a lot. Forgot this time, sorry about that!
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]