On Mon, Sep 23, 2019 at 8:00 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>
> On Mon, Sep 23, 2019 at 07:50:15AM -0700, Alexander Duyck wrote:
> > > > +static inline void
> > > > +page_reporting_reset_boundary(struct zone *zone, unsigned int order, int mt)
> > > > +{
> > > > +	int index;
> > > > +
> > > > +	if (order < PAGE_REPORTING_MIN_ORDER)
> > > > +		return;
> > > > +	if (!test_bit(ZONE_PAGE_REPORTING_ACTIVE, &zone->flags))
> > > > +		return;
> > > > +
> > > > +	index = get_reporting_index(order, mt);
> > > > +	reported_boundary[index] = &zone->free_area[order].free_list[mt];
> > > > +}
> > >
> > > So this seems to be costly.
> > > I'm guessing it's the access to flags:
> > >
> > >	/* zone flags, see below */
> > >	unsigned long flags;
> > >
> > >	/* Primarily protects free_area */
> > >	spinlock_t lock;
> > >
> > > which is in the same cache line as the lock.
> >
> > I'm not sure what you mean by this being costly?
>
> I've just been wondering why will-it-scale reports a 1.5% regression
> with this patch.

Are you referring to data from a test you have run, or to the data I
have collected? In the data I have collected I see almost no
difference as long as the pages are not actually being madvised. Once
I turn on the madvise I start seeing the regression, but almost all
of that is due to page zeroing/faulting.

There isn't expected to be a gain from this patchset until you start
having guests dealing with memory overcommit on the host. At that
point the patch set should start showing gains, once the madvise bits
are enabled in QEMU.

Also, the test I have been running is a modified version of the
page_fault1 test that specifically targets transparent huge pages, in
order to make the test that much more difficult. The standard
page_fault1 test wasn't showing much of anything, since the overhead
of breaking a 2M page into 512 4K pages and zeroing them individually
in the guest was essentially drowning out the effect of the patches
themselves.

- Alex
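
[Editor's sketch, for readers unfamiliar with the benchmark under
discussion: the loop below illustrates what a THP-targeting variant
of will-it-scale's page_fault1 might look like. The mapping size, the
reliance on madvise(MADV_HUGEPAGE), and the function names are
illustrative assumptions, not the exact test Alex ran.]

/*
 * Minimal sketch of a THP-focused page-fault microbenchmark in the
 * spirit of will-it-scale's page_fault1: map an anonymous region,
 * hint the kernel to use transparent huge pages, and touch one byte
 * per 4K page so every page must be faulted in and zeroed.
 * Assumptions for illustration only: the 128 MB region size, the
 * MADV_HUGEPAGE hint, and the iteration structure.
 */
#include <stdlib.h>
#include <sys/mman.h>

#define MAP_SIZE	(128UL * 1024 * 1024)	/* region per iteration */
#define PAGE_SIZE_4K	4096UL

static void fault_in_region(void)
{
	char *p = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	unsigned long off;

	if (p == MAP_FAILED)
		exit(1);

	/* Ask the kernel to back this mapping with 2M THPs. */
	madvise(p, MAP_SIZE, MADV_HUGEPAGE);

	/* Write one byte per 4K page; with THP a single 2M fault covers 512 pages. */
	for (off = 0; off < MAP_SIZE; off += PAGE_SIZE_4K)
		p[off] = 1;

	/* Unmapping frees the pages so the guest can report/hint them again. */
	munmap(p, MAP_SIZE);
}

int main(void)
{
	for (;;)
		fault_in_region();	/* a real harness would count iterations per second */
}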