Re: [PATCH net-next v12 01/14] mm: page_frag: add a test module for page_frag

Yunsheng Lin <linyunsheng@xxxxxxxxxx> · Thu, 1 Aug 2024 20:58:36 +0800

On 2024/8/1 2:29, Alexander Duyck wrote:
> On Wed, Jul 31, 2024 at 5:50 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>>
>> Basing on the lib/objpool.c, change it to something like a
>> ptrpool, so that we can utilize that to test the correctness
>> and performance of the page_frag.
>>
>> The testing is done by ensuring that the fragment allocated
>> from a frag_frag_cache instance is pushed into a ptrpool
>> instance in a kthread binded to a specified cpu, and a kthread
>> binded to a specified cpu will pop the fragment from the
>> ptrpool and free the fragment.
>>
>> We may refactor out the common part between objpool and ptrpool
>> if this ptrpool thing turns out to be helpful for other place.
> 
> This isn't a patch where you should be introducing stuff you hope to
> refactor out and reuse later. Your objpoo/ptrpool stuff is just going
> to add bloat and overhead as you are going to have to do pointer
> changes to get them in and out of memory and you are having to scan
> per-cpu lists. You would be better served using a simple array as your
> threads should be stick to a consistent CPU anyway in terms of
> testing.
> 
> I would suggest keeping this much more simple. Trying to pattern this
> after something like the dmapool_test code would be a better way to go
> for this. We don't need all this extra objpool overhead getting in the
> way of testing the code you should be focused on. Just allocate your
> array on one specific CPU and start placing and removing your pages
> from there instead of messing with the push/pop semantics.

I am not sure if I understand what you meant here, do you meant something
like dmapool_test_alloc() does as something like below?

static int page_frag_test_alloc(void **p, int blocks)
{
	int i;

	for (i = 0; i < blocks; i++) {
		p[i] = page_frag_alloc(&test_frag, test_alloc_len, GFP_KERNEL);

		if (!p[i])
			goto pool_fail;
	}

	for (i = 0; i < blocks; i++)
		page_frag_free(p[i]);

	....
}

The above was my initial thinking too, I went to the ptrpool thing using
at least two CPUs as the below reason:
1. Test the concurrent calling between allocing and freeing more throughly,
   for example, page->_refcount concurrent handling, cache draining and
   cache reusing code path will be tested more throughly.
2. Test the performance impact of cache bouncing between different CPUs.

I am not sure if there is a more lightweight implementation than ptrpool
to do the above testing more throughly.

> 
> Lastly something that is a module only tester that always fails to
> probe doesn't sound like it really makes sense as a standard kernel

I had the same feeling as you, but when doing testing, it seems
convenient enough to do a 'insmod xxx.ko' for testing without a
'rmmod xxx.ko'

> module. I still think it would make more sense to move it to the
> selftests tree and just have it build there as a module instead of

I failed to find one example of test kernel module that is in the
selftests tree yet. If it does make sense, please provide an example
here, and I am willing to follow the pattern if there is one.

> trying to force it into the mm tree. The example of dmapool_test makes
> sense as it could be run at early boot to run the test and then it

I suppose you meant dmapool is built-in to the kernel and run at early
boot? I am not sure what is the point of built-in for dmapool, as it
only do one-time testing, and built-in for dmapool only waste some
memory when testing is done.

> just goes quiet. This module won't load and will always just return
> -EAGAIN which doesn't sound like a valid kernel module to me.

As above, it seems convenient enough to do a 'insmod xxx.ko' for testing
without a 'rmmod xxx.ko'.