On Wed, Oct 2, 2019 at 3:37 AM Nitesh Narayan Lal <nitesh@xxxxxxxxxx> wrote:
>
>
> On 10/1/19 8:55 PM, Alexander Duyck wrote:
> > On Tue, Oct 1, 2019 at 12:16 PM Nitesh Narayan Lal <nitesh@xxxxxxxxxx> wrote:
> >>
> >> On 10/1/19 12:21 PM, Alexander Duyck wrote:
> >>> On Tue, 2019-10-01 at 17:35 +0200, David Hildenbrand wrote:
> >>>> On 01.10.19 17:29, Alexander Duyck wrote:

<snip>

> >>> Do we have a working patch set for Nitesh's code? The last time I tried
> >>> running his patch set I ran into issues with kernel panics. If we have a
> >>> known working/stable patch set I can give it a try.
> >> Did you try the v12 patch-set [1]?
> >> I remember that you reported the CPU stall issue, which I fixed in the v12.
> >>
> >> [1] https://lkml.org/lkml/2019/8/12/593
> > So I tried testing with the spin_lock calls replaced with spin_lock
> > _irq to resolve the IRQ issue. I also had shuffle enabled in order to
> > increase the number of pages being dirtied.
> >
> > With that setup the bitmap approach is running significantly worse
> > then my approach, even with the boundary removed. Since I had to
> > modify the code to even getting working I am not comfortable posting
> > numbers.
> I didn't face any issue in getting the code work or compile.
> Before my v12 posting, I did try your previously suggested test
> (will-it-scale/page_fault1 for 12 hours on a 60 GB) and didn't see any issues.
> I think it would help more if you can share the setup which you are running.

So one issue with the standard page_fault1 test is that it only operates
at the 4K page level. You won't see much impact from your patches with
that, as the overhead of splitting a MAX_ORDER - 2 page down to 4K pages
will end up being the biggest thing you are benchmarking. I think I have
brought it up before, but I am running with page_fault1 modified to use
THP. Making the change is pretty straightforward: all you have to do is
add an madvise() call to the test code.
All that is needed is to add "madvise(c, MEMSIZE, MADV_HUGEPAGE);" between
the assert and the for loop in the page_fault1 code and then rebuild the
test. I actually copied page_fault1.c into a file I named page_fault4.c
and added the line; as a result, the build seems to pick it up as an
additional test.

The only other alteration I can think of that might have much impact
would be to enable page shuffling. The idea is that it causes us to use
more pages, because half of the freed pages are dumped to the tail of the
free list, so we are constantly churning the memory.

> > My suggestion would be to look at reworking the patch set and
> > post numbers for my patch set versus the bitmap approach and we can
> > look at them then.
> Agreed. However, in order to fix an issue I have to reproduce it first.

With the tweak I suggested above, it should be much easier to reproduce.
Basically all you need is to have allocations competing against hinting.
Currently the hinting isn't seeing that, because the allocations are
mostly coming out of 4K pages instead of higher-order ones. Alternatively,
you could just make the change I had proposed about using
spin_lock/unlock_irq in your worker thread; that resolved the issue for me.

> > I would prefer not to spend my time fixing and
> > tuning a patch set that I am still not convinced is viable.
> You don't have to, I can fix the issues in my patch-set. :)

Sounds good. Hopefully the stuff I pointed out above helps you to get a
reproduction and resolve the issues.

- Alex