Hi Reinette and Shuah, > On 11/2/2022 2:41 AM, Shuah Khan wrote: > > On 11/1/22 03:43, Shaopeng Tan wrote: > >> After creating a child process with fork() in CAT test, if there is > >> an error occurs or such as a SIGINT signal is received, the parent > >> process will be terminated immediately, but the child process will > >> not be killed and also umount_resctrlfs() will not be called. > >> > >> Add a signal handler like other tests to kill child process, umount > >> resctrlfs, cleanup result files, etc. when an error occurs. > >> > >> Signed-off-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxxxxx> > >> --- > >> tools/testing/selftests/resctrl/cat_test.c | 28 > >> +++++++++++++++------- > >> 1 file changed, 19 insertions(+), 9 deletions(-) > >> > >> diff --git a/tools/testing/selftests/resctrl/cat_test.c > >> b/tools/testing/selftests/resctrl/cat_test.c > >> index 6a8306b0a109..5f81817f4366 100644 > >> --- a/tools/testing/selftests/resctrl/cat_test.c > >> +++ b/tools/testing/selftests/resctrl/cat_test.c > >> @@ -98,12 +98,21 @@ void cat_test_cleanup(void) > >> remove(RESULT_FILE_NAME2); > >> } > >> +static void ctrl_handler(int signo) > >> +{ > >> + kill(bm_pid, SIGKILL); > >> + umount_resctrlfs(); > >> + tests_cleanup(); > >> + ksft_print_msg("Ending\n\n"); > > > > Is there a reason to print this message? Remove it unless it serves a > > purpose. > > This function appears to be a duplicate of existing resctrl_val.c:ctrlc_handler(). > Could the duplication be avoided instead of refining the copy? Yes, it's a duplicate of existing resctrl_val.c:ctrlc_handler(). I will try to use resctrl_val.c:ctrlc_handler() in next version patch series. > >> + > >> + exit(EXIT_SUCCESS); > >> +} > >> + > >> int cat_perf_miss_val(int cpu_no, int n, char *cache_type) > >> { > >> unsigned long l_mask, l_mask_1; > >> int ret, pipefd[2], sibling_cpu_no; > >> char pipe_message; > >> - pid_t bm_pid; > > > > Odd. bm_pid is used below - why remove it here? > > There is a global bm_pid in resctrl_val.c that is made available via extern in > resctrl.h. This is what causes this code to still compile but I would also like to > learn why moving to that is desired as a change here. I expect such a big > change to get a mention in the commit message. Yes. I used global bm_pid, since bm_pid is used in ctrl_handler() before this function cat_perf_miss_val(). I will add a description to commit message. > >> cache_size = 0; > >> @@ -181,17 +190,19 @@ int cat_perf_miss_val(int cpu_no, int n, char > >> *cache_type) > >> strcpy(param.filename, RESULT_FILE_NAME1); > >> param.num_of_runs = 0; > >> param.cpu_no = sibling_cpu_no; > >> + } else { > >> + /* set up ctrl-c handler */ > >> + if (signal(SIGINT, ctrl_handler) == SIG_ERR || > >> + signal(SIGHUP, ctrl_handler) == SIG_ERR || > >> + signal(SIGTERM, ctrl_handler) == SIG_ERR) > >> + printf("Failed to catch SIGNAL!\n"); > > > > Is perror() more appropriate here? > > Should we be using signal() at all? "man signal" reads: > "WARNING: the behavior of signal() varies across UNIX versions, and has also > varied historically across different versions of Linux. > Avoid its use: use sigaction(2) instead." > > "Failed to catch SIGNAL" also seems unclear to me. This is where a signal > handler is set up, the signal for which the handler is installed has not arrived. I will use sigaction(2) and perror(). > >> } > >> remove(param.filename); > >> ret = cat_val(¶m); > >> - if (ret) > >> - return ret; > >> - > >> - ret = check_results(¶m); > >> - if (ret) > >> - return ret; > >> + if (ret == 0) > >> + ret = check_results(¶m); > > > > Why not use a goto in error case to do umount_resctrlfs() instead of > > changing the conditionals? > > My understanding is the code that follows is needed to synchronize the exits > between the parent and child. It is the parent that will run umount_resctrlfs() > and it should only do so after the child is done. A goto by the parent may thus > cause > umount_resctrlfs() to be run while the child still relies on it while a goto by the > child may cause the parent not to receive the message that the child is > complete. Yes, the code that follows is needed to synchronize the exits between the parent and child. > > @@ -201,7 +212,6 @@ int cat_perf_miss_val(int cpu_no, int n, char > *cache_type) > > sizeof(pipe_message)) { > > close(pipefd[1]); > > perror("# failed signaling parent process"); > > - return errno; > > It looks like pipefd[1] will be closed twice if the write() failed. This close(pipefd[1]); should also be removed. > It does look strange to let the child continue to its infinite loop after the write() > failed. I assume that it is because the parent will also be stuck and the new > ctrl_handler() is expected to unblock both? If a SIGNAL is received, ctrl_handler() will be called to kill the child process and exit parent process. If no SIGNAL is received, the parent process will kill the child process. (by else{kill(bm_pid, SIGKILL);}) > Could you please add a comment to the code to clarify this flow? I will add comments here. Best regards, Shaopeng Tan