On Thu 2025-01-16 13:03:16, laokz wrote:
> Hi Petr,
>
> Thanks for the quick reply.
>
> On 1/15/2025 11:57 PM, Petr Mladek wrote:
> > On Wed 2025-01-15 08:32:12, laokz@xxxxxxxxxxx wrote:
> > > When do livepatch transition, kernel call klp_try_complete_transition()
> > > which in-turn might call klp_send_signals(). klp_send_signal() has the code:
> > >
> > > 	if (klp_signals_cnt == SIGNALS_TIMEOUT)
> > > 		pr_notice("signaling remaining tasks\n");
> > >
> > > Do we need to match or filter out this message when check_result?
> > > And here klp_signals_cnt MUST EQUAL to SIGNALS_TIMEOUT, right?
>
> Oops, I misunderstood the 2nd question: (klp_signals_cnt % SIGNALS_TIMEOUT
> == 0) not always mean equal.
>
> > Good question. Have you seen this message when running the selftests, please?
> >
> > I wonder which test could trigger it. I do not recall any test
> > livepatch where the transition might get blocked for too long.
> >
> > There is the self test with a blocked transition ("busy target
> > module") but the waiting is stopped much earlier there.
> >
> > The message probably might get printed when the selftests are
> > called on a huge and very busy system. But then we might get
> > into troubles also with other timeouts. So it would be nice
> > to know more details about when this happens.
>
> We're trying to port livepatch to RISC-V. In my qemu virt VM in a cloud
> environment, all tests passed except test-syscall.sh. Mostly it complained
> the missed dmesg "signaling remaining tasks". I want to confirm from your
> experts that in theory the failure is expected, or if we could filter out
> this potential dmesg completely.

The test-syscall.sh test spawns many processes which are calling
the SYS_getpid syscall in a busy loop. I could imagine that it
might cause problems when the virt VM emulates many more virtual
CPUs than the assigned real CPUs. It might be even worse when
the RISC-V processor is just emulated on another architecture.

Anyway, we have already limited the max number of processes because
they overflow the default log buffer size, see the commit
46edf5d7aed54380 ("selftests/livepatch: define max test-syscall
processes").

Does it help to reduce the MAXPROC limit from 128 to 64, 32, or 16?
IMHO, even 16 processes are good enough. We do not need to waste
that many resources on QA.

You might also review the setup of your VM and reduce the number
of emulated CPUs. If the VM is not able to reasonably handle high
load then it might show false positives in many tests.

If nothing helps, feel free to send a patch for filtering the
"signaling remaining tasks" message. IMHO, it is perfectly fine
to hide this message. Just extend the already existing filter
in the "check_result" function.

Best Regards,
Petr
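
For illustration only, filtering the "signaling remaining tasks" message could look roughly like the sketch below. It is not the actual check_result code: the helper name filter_klp_log is hypothetical, the include patterns are assumptions, and it is assumed that the notice reaches the log with the "livepatch:" prefix. The log lines in the usage example are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper, NOT the real check_result() from the selftests.
# Reads kernel log lines on stdin and prints only the lines that would be
# compared against the expected test output.
filter_klp_log() {
	# Keep livepatch-related lines (assumed patterns), then drop the
	# timing-dependent notice. "|| true" makes an empty result a success.
	grep -e 'livepatch:' -e 'test_klp' |
		grep -v 'signaling remaining tasks' || true
}

# Usage example with made-up log lines: the notice is dropped,
# the other two livepatch lines are kept.
printf '%s\n' \
	"livepatch: 'test_klp_syscall': starting patching transition" \
	"livepatch: signaling remaining tasks" \
	"livepatch: 'test_klp_syscall': patching complete" |
	filter_klp_log
```

With such a filter the expected output of the test stays the same regardless of whether the transition took long enough to trigger the notice.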