Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

On Tue, 22 Oct 2019, Waiman Long wrote:

> >>> and used nr_free to compute the missing count. Since MIGRATE_MOVABLE
> >>> is usually the largest one on large memory systems, this is the one
> >>> to be skipped. Since the printing order is migration-type => order, we
> >>> will have to store the counts in an internal 2D array before printing
> >>> them out.
> >>>
> >>> Even by skipping the MIGRATE_MOVABLE pages, we may still be holding the
> >>> zone lock for too long blocking out other zone lock waiters from being
> >>> run. This can be problematic for systems with large amount of memory.
> >>> So a check is added to temporarily release the lock and reschedule if
> >>> more than 64k of list entries have been iterated for each order. With
> >>> a MAX_ORDER of 11, the worst case will be iterating about 700k of list
> >>> entries before releasing the lock.
> >> But you are still iterating through the whole free_list at once so if it
> >> gets really large then this is still possible. I think it would be
> >> preferable to use per migratetype nr_free if it doesn't cause any
> >> regressions.
> >>
> > Yes, it is still theoretically possible. I will take a further look at
> > having per-migrate type nr_free. BTW, there is one more place where the
> > free lists are being iterated with zone lock held - mark_free_pages().
> 
> Looking deeper into the code, the exact migration type is not stored in
> the page itself. An initial movable page can be stolen to be put into
> another migration type. So in a delete or move from free_area, we don't
> know exactly what migration type the page is coming from. IOW, it is
> hard to get accurate counts of the number of entries in each lists.
> 

I think the suggestion is to maintain an nr_free count for the free_list
of each order and each migratetype, so that any time a page is added to or
deleted from the list, that nr_free is adjusted.  The free_area's nr_free
then becomes the sum of its migratetypes' nr_free at that order.  That's
possible to do if you track the migratetype per page, as you said, or
track it as part of page->index the way pcp pages do.  It's a trade-off on
whether you want to take the performance impact of maintaining these new
nr_frees any time you manipulate the freelists.
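
To make the bookkeeping concrete, here is a rough userspace sketch, not
kernel code.  The struct and function names (zone_sketch,
free_area_sketch, sketch_*) and the four-entry migratetype enum are made
up for illustration; the real free_area layout and list helpers differ,
and the caller is assumed to already know the migratetype of the page.

```c
#include <assert.h>

#define SKETCH_MAX_ORDER      11
#define SKETCH_MIGRATE_TYPES   4   /* simplified; the kernel has more types */

/* Hypothetical, reduced set of migratetypes for the sketch. */
enum sketch_migratetype {
	SKETCH_UNMOVABLE,
	SKETCH_MOVABLE,
	SKETCH_RECLAIMABLE,
	SKETCH_OTHER,
};

/* One counter per migratetype instead of a single nr_free per order. */
struct free_area_sketch {
	unsigned long nr_free_mt[SKETCH_MIGRATE_TYPES];
};

struct zone_sketch {
	struct free_area_sketch free_area[SKETCH_MAX_ORDER];
};

/* Called wherever a page would be linked into free_list[mt]. */
static void sketch_add_to_free_list(struct zone_sketch *z,
				    unsigned int order, int mt)
{
	z->free_area[order].nr_free_mt[mt]++;
}

/* Called wherever a page would be unlinked; mt must be the list it was on. */
static void sketch_del_from_free_list(struct zone_sketch *z,
				      unsigned int order, int mt)
{
	assert(z->free_area[order].nr_free_mt[mt] > 0);
	z->free_area[order].nr_free_mt[mt]--;
}

/* The per-order nr_free is now derived: the sum over all migratetypes. */
static unsigned long sketch_nr_free(const struct zone_sketch *z,
				    unsigned int order)
{
	unsigned long sum = 0;
	int mt;

	for (mt = 0; mt < SKETCH_MIGRATE_TYPES; mt++)
		sum += z->free_area[order].nr_free_mt[mt];
	return sum;
}
```

With counters like these, a reader such as /proc/pagetypeinfo could copy
the values out without walking any list.  The cost is the extra
increment/decrement on every freelist manipulation, plus knowing the
right mt on the delete path, which is the part Waiman pointed out is
hard today.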

I think Vlastimil and I discussed per order per migratetype nr_frees in
the past and it could be a worthwhile improvement for other reasons,
specifically it enables heuristics that can be used to determine how
fragmented a certain migratetype is for a zone, i.e. a very quick way to
determine what ratio of pages over all MIGRATE_UNMOVABLE pageblocks are
free.
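
As an illustration of that kind of heuristic, here is a small sketch of
the arithmetic only.  The function name, its parameters, and the idea of
passing in a pageblock count are assumptions for the example; it also
treats free-list membership as a proxy for "free page inside a pageblock
of this type", which is only approximate once pages have been stolen
between types.

```c
#include <assert.h>

#define SKETCH_MAX_ORDER 11

/*
 * Percentage of pages that are free, for one migratetype in one zone.
 *
 * nr_free_mt[order] : hypothetical per-order counter for this migratetype
 * nr_pageblocks     : number of pageblocks of this migratetype (assumed known)
 * pageblock_order   : order of a pageblock, e.g. 9 for 512 pages
 */
static unsigned int sketch_free_percent(const unsigned long nr_free_mt[SKETCH_MAX_ORDER],
					unsigned long nr_pageblocks,
					unsigned int pageblock_order)
{
	unsigned long free_pages = 0;
	unsigned long total_pages;
	unsigned int order;

	assert(pageblock_order < SKETCH_MAX_ORDER);

	/* Each free entry of a given order covers 2^order base pages. */
	for (order = 0; order < SKETCH_MAX_ORDER; order++)
		free_pages += nr_free_mt[order] << order;

	total_pages = nr_pageblocks << pageblock_order;
	if (!total_pages)
		return 0;

	return (unsigned int)(free_pages * 100 / total_pages);
}
```

No list is walked and no zone lock hold time scales with list length
here; the whole computation is MAX_ORDER additions over the counters.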

Or maybe there are other reasons why these nr_frees can't be maintained 
anymore?  (I had a patch to do it on 4.3.)

You may also find that, on a severely fragmented system, MIGRATE_MOVABLE
is not actually the longest free_list among the migratetypes, so special
casing MIGRATE_MOVABLE might not be the best way forward.



