Re: [PATCH v13 4/5] mm: support reporting free page blocks

On Thu 03-08-17 18:42:15, Wei Wang wrote:
> On 08/03/2017 05:11 PM, Michal Hocko wrote:
> >On Thu 03-08-17 14:38:18, Wei Wang wrote:
[...]
> >>+static int report_free_page_block(struct zone *zone, unsigned int order,
> >>+				  unsigned int migratetype, struct page **page)
> >This is just too ugly and wrong actually. Never provide struct page
> >pointers outside of the zone->lock. What I've had in mind was to simply
> >walk free lists of the suitable order and call the callback for each one.
> >Something as simple as
> >
> >	for (i = 0; i < MAX_NR_ZONES; i++) {
> >		struct zone *zone = &pgdat->node_zones[i];
> >
> >		if (!populated_zone(zone))
> >			continue;
> >		spin_lock_irqsave(&zone->lock, flags);
> >		for (order = min_order; order < MAX_ORDER; ++order) {
> >			struct free_area *free_area = &zone->free_area[order];
> >			enum migratetype mt;
> >			struct page *page;
> >
> >			if (!free_area->nr_free)
> >				continue;
> >
> >			for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> >				list_for_each_entry(page,
> >						&free_area->free_list[mt], lru) {
> >
> >					pfn = page_to_pfn(page);
> >					visit(opaque2, pfn, 1<<order);
> >				}
> >			}
> >		}
> >
> >		spin_unlock_irqrestore(&zone->lock, flags);
> >	}
> >
> >[...]
> 
> 
> I think the above would hold the lock for too long. That's why we
> prefer to take one free page block at a time; taking them one by one
> also makes no difference to the performance that we
> need.

I think you should start with a simple approach and improve incrementally
if this turns out not to be optimal. I really detest taking struct pages
outside of the lock. You never know what might happen after the lock is
dropped. E.g. can you race with memory hotremove?

> The struct page is used as a "state" to get the next free page block. It is
> only given to the internal implementation of a function in mm (not seen by
> the outside caller). Would this be OK?
> If not, how about pfn? We can also pass in a pfn to the function, do
> pfn_to_page each time the function starts, and then do page_to_pfn when
> it returns.

No, just do not try to play tricks with struct pages which might have
gone away.
-- 
Michal Hocko
SUSE Labs


