Re: [PATCH 2/2] selftests/migration: Disable NUMA balancing and check migration status

Ryan Roberts <ryan.roberts@xxxxxxx> writes:

> On 07/08/2023 07:39, Alistair Popple wrote:
>> The migration selftest was only checking the return code and not the
>> status array for migration success/failure. Update the test to check
>> both. This uncovered a bug in the return code handling of
>> do_pages_move().
>> 
>> Also disable NUMA balancing as that can lead to unexpected migration
>> failures.
>> 
>> Signed-off-by: Alistair Popple <apopple@xxxxxxxxxx>
>> Suggested-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>> ---
>> 
>> Ryan, this will still cause the test to fail if a migration failed. I
>> was unable to reproduce a migration failure for any cases on my system
>> once I disabled NUMA balancing though so I'd be curious if you are
>> still seeing failures with this patch applied. AFAIK there shouldn't
>> be anything else that would be causing migration failure so would like
>> to know what is causing failures. Thanks!
>
>
> Hi Alistair,
>
> Afraid I'm still seeing unmigrated pages when running with these 2 patches:
>
>
> #  RUN           migration.shared_anon ...
> Didn't migrate 1 pages
> # migration.c:183:shared_anon:Expected migrate(ptr, self->n1, self->n2) (-2) == 0 (0)
> # shared_anon: Test terminated by assertion
> #          FAIL  migration.shared_anon
> not ok 2 migration.shared_anon
>
>
> I added some instrumentation; it usually fails on the second time
> through the loop in migrate() but I've also seen it fail the first
> time. Never seen it get through 2 iterations successfully though.

Interesting. I guess migration failure is always possible for various
reasons so I will update the test to report the number of failed
migrations rather than making it a test failure. I was mostly just
curious as to what would be causing the occasional failures for my own
understanding, but the failures themselves are unimportant.
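
As a rough sketch only of what reporting rather than failing could look
like: this assumes the status array filled in by move_pages() holds a
node number for each page that moved and a negative errno for each page
that did not. The helper names count_failed_migrations() and
report_failed_migrations() are hypothetical and are not part of this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: count the entries in a
 * move_pages()-style status array that hold a negative value, i.e.
 * the pages that did not migrate.
 */
static size_t count_failed_migrations(const int *status, size_t n)
{
	size_t i, failed = 0;

	/* A NULL array is only acceptable when there is nothing to scan. */
	assert(status != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (status[i] < 0)
			failed++;
	}

	return failed;
}

/* Report failures as information rather than as a test failure. */
static void report_failed_migrations(const int *status, size_t n)
{
	size_t failed = count_failed_migrations(status, n);

	if (failed)
		printf("Didn't migrate %zu of %zu pages\n", failed, n);
}
```

The caller would keep looping after calling report_failed_migrations()
instead of returning -2, so an occasional unmigrated page shows up in
the test output without terminating the test.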

> I did also try just this patch without the error handling update in the kernel, but it still fails in the same way.
>
> I'm running on arm64 in case that wasn't clear. Let me know if there is anything I can do to help debug.

Thanks! Unless you're concerned about the failures I am happy to ignore
them. Pages can fail to migrate for all sorts of reasons, although I'm a
little surprised anonymous migrations are failing so frequently for you.

> Thanks,
> Ryan
>
>
>> 
>>  tools/testing/selftests/mm/migration.c | 18 +++++++++++++++++-
>>  1 file changed, 17 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/testing/selftests/mm/migration.c b/tools/testing/selftests/mm/migration.c
>> index 379581567f27..cf079af5799b 100644
>> --- a/tools/testing/selftests/mm/migration.c
>> +++ b/tools/testing/selftests/mm/migration.c
>> @@ -51,6 +51,12 @@ FIXTURE_SETUP(migration)
>>  	ASSERT_NE(self->threads, NULL);
>>  	self->pids = malloc(self->nthreads * sizeof(*self->pids));
>>  	ASSERT_NE(self->pids, NULL);
>> +
>> +	/*
>> +	 * Disable NUMA balancing which can cause migration
>> +	 * failures.
>> +	 */
>> +	numa_set_membind(numa_all_nodes_ptr);
>>  };
>>  
>>  FIXTURE_TEARDOWN(migration)
>> @@ -62,13 +68,14 @@ FIXTURE_TEARDOWN(migration)
>>  int migrate(uint64_t *ptr, int n1, int n2)
>>  {
>>  	int ret, tmp;
>> -	int status = 0;
>>  	struct timespec ts1, ts2;
>>  
>>  	if (clock_gettime(CLOCK_MONOTONIC, &ts1))
>>  		return -1;
>>  
>>  	while (1) {
>> +		int status = NUMA_NUM_NODES + 1;
>> +
>>  		if (clock_gettime(CLOCK_MONOTONIC, &ts2))
>>  			return -1;
>>  
>> @@ -85,6 +92,15 @@ int migrate(uint64_t *ptr, int n1, int n2)
>>  			return -2;
>>  		}
>>  
>> +		/*
>> +		 * Note we should never see this because move_pages() should
>> +		 * have indicated a page couldn't migrate above.
>> +		 */
>> +		if (status < 0) {
>> +			printf("Page didn't migrate, error %d\n", status);
>> +			return -2;
>> +		}
>> +
>>  		tmp = n2;
>>  		n2 = n1;
>>  		n1 = tmp;




