On Tue, 2024-12-24 at 11:20 -0500, Mathieu Desnoyers wrote: > On 2024-12-16 08:09, Gabriele Monaco wrote: > > A task in the kernel (task_mm_cid_work) runs somewhat periodically > > to > > compact the mm_cid for each process, this test tries to validate > > that > > it runs correctly and timely. > > > > + if (curr_mm_cid == 0) { > > + printf_verbose( > > + "mm_cids successfully compacted, exiting\n"); > > + pthread_exit(NULL); > > + } > > + usleep(RUNNER_PERIOD); > > + } > > + assert(false); > > I suspect we'd want an explicit error message here > with an abort() rather than an assertion which can be > compiled-out with -DNDEBUG. > > > + } > > + printf_verbose("cpu%d has %d and is going to terminate\n", > > + sched_getcpu(), curr_mm_cid); > > + pthread_exit(NULL); > > +} > > + > > +void test_mm_cid_compaction(void) > > This function should return its error to the caller > rather than assert. > > > +{ > > + cpu_set_t affinity; > > + int i, j, ret, num_threads; > > + pthread_t *tinfo; > > + pthread_mutex_t *token; > > + struct thread_args *args; > > + > > + sched_getaffinity(0, sizeof(affinity), &affinity); > > + num_threads = CPU_COUNT(&affinity); > > + tinfo = calloc(num_threads, sizeof(*tinfo)); > > + if (!tinfo) { > > + fprintf(stderr, "Error: failed to allocate tinfo(%d): %s\n", > > + errno, strerror(errno)); > > + assert(ret == 0); > > + } > > + args = calloc(num_threads, sizeof(*args)); > > + if (!args) { > > + fprintf(stderr, "Error: failed to allocate args(%d): %s\n", > > + errno, strerror(errno)); > > + assert(ret == 0); > > + } > > + token = calloc(num_threads, sizeof(*token)); > > + if (!token) { > > + fprintf(stderr, "Error: failed to allocate token(%d): %s\n", > > + errno, strerror(errno)); > > + assert(ret == 0); > > + } > > + if (num_threads == 1) { > > + printf_verbose( > > + "Running on a single cpu, cannot test anything\n"); > > + return; > > This should return a value telling the caller that > the test is skipped (not an error per se). > Thanks for the review! I'm not sure how to properly handle these, but it seems to me the cleanest way is to use ksft_* functions to report failures and skipped tests. Other tests in rseq don't use the library but it doesn't seem a big deal if just one test is using it, for now. It gets a bit complicated to return values since we are exiting from the main thread (sure we could join the remaining /winning/ thread but we would end up with 2 threads running). The ksft_* functions solve this quite nicely using exit codes, though. > > + } > > + pthread_mutex_init(token, NULL); > > + /* The main thread runs on CPU0 */ > > + for (i = 0, j = 0; i < CPU_SETSIZE && j < num_threads; i++) { > > + if (CPU_ISSET(i, &affinity)) { > > We can save an indent level here by moving this > in the for () condition: > > for (i = 0, j = 0; i < CPU_SETSIZE && > CPU_ISSET(i, &affinity) && j < num_threads; i++) { > Well, if we assume the affinity mask is contiguous, which is likely but not always true. A typical setup with isolated CPUs have one housekeeping core per NUMA node, let's say 0,32,64,96 out of 127 cpus, the test would run only on cpu 0 in that case. > > + args[j].num_cpus = num_threads; > > + args[j].tinfo = tinfo; > > + args[j].token = token; > > + args[j].cpu = i; > > + args[j].args_head = args; > > + if (!j) { > > + /* The first thread is the main one */ > > + tinfo[0] = pthread_self(); > > + ++j; > > + continue; > > + } > > + ret = pthread_create(&tinfo[j], NULL, thread_runner, > > + &args[j]); > > + if (ret) { > > + fprintf(stderr, > > + "Error: failed to create thread(%d): %s\n", > > + ret, strerror(ret)); > > + assert(ret == 0); > > + } > > + ++j; > > + } > > + } > > + printf_verbose("Started %d threads\n", num_threads); > > I think there is a missing rendez-vous point here. Assuming a > sufficiently long unexpected delay (think of a guest VM VCPU > preempted for a long time), the new leader can start poking > into args and other thread's info while we are still creating > threads here. > Yeah, good point, I'm assuming all threads are ready by the time we are done waiting but that's not bulletproof. I'll add a barrier. Thanks again for the comments, I'll prepare a V4. Gabriele