On Fri, Aug 20, 2021, Mathieu Desnoyers wrote:
> I still really hate flakiness in tests, because then people stop caring when they
> fail once in a while. And with the nature of rseq, a once-in-a-while failure is a
> big deal. Let's see if we can use other tricks to ensure stability of the cpu id
> without changing timings too much.

Yeah, zero argument regarding flaky tests.

> One idea would be to use a seqcount lock.

A sequence counter did the trick! Thanks much!

> But even if we use that, I'm concerned that the very long writer critical
> section calling sched_setaffinity would need to be alternated with a sleep to
> ensure the read-side progresses. The sleep delay could be relatively small
> compared to the duration of the sched_setaffinity call, e.g. ratio 1:10.

I already had an arbitrary usleep(10) to let the reader make progress between
sched_setaffinity() calls. Dropping it down to 1us didn't affect reproducibility,
so I went with that to shave those precious cycles :-) Eliminating the delay
entirely did result in no repro, which was a nice confirmation that it's needed
to let the reader get back into KVM_RUN.

Thanks again!
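
For anyone following along, here is a rough sketch of the sequence counter pattern described above. It is not the actual patch; the names seq_cnt, migrate_once(), read_begin(), read_retry() and read_stable_cpu() are made up for illustration, and it assumes a single writer thread. The writer makes the counter odd for the duration of sched_setaffinity(), and the reader discards any CPU id it sampled while the counter was odd or while it changed.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical sequence counter: odd while a migration is in flight. */
static atomic_uint seq_cnt;

/* Writer side: bracket one affinity change with two increments. */
static int migrate_once(pid_t tid, int cpu)
{
	cpu_set_t mask;
	int ret;

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);

	atomic_fetch_add(&seq_cnt, 1);	/* odd: readers must retry */
	ret = sched_setaffinity(tid, sizeof(mask), &mask);
	atomic_fetch_add(&seq_cnt, 1);	/* even: samples are valid again */

	/* Short pause (1us, as in the thread) so the reader can make progress. */
	usleep(1);
	return ret;
}

/* Reader side: clear the low bit so an odd counter can never match later. */
static unsigned int read_begin(void)
{
	return atomic_load(&seq_cnt) & ~1u;
}

/* True if a migration started or finished since read_begin(). */
static bool read_retry(unsigned int snap)
{
	return atomic_load(&seq_cnt) != snap;
}

/* Sample the CPU id, retrying until no migration overlapped the read. */
static int read_stable_cpu(void)
{
	unsigned int snap;
	int cpu;

	do {
		snap = read_begin();
		cpu = sched_getcpu();
	} while (read_retry(snap));

	return cpu;
}
```

In a real test the writer would call migrate_once() in a loop from its own thread, targeting the reader's tid, and the reader would do its comparison (e.g. against the rseq cpu id) inside the do/while body in place of the bare sched_getcpu().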