On 04/30/2018 12:00 PM, Oleg Nesterov wrote: > On 04/30, Andrey Grodzovsky wrote: >> What about changing PF_SIGNALED to PF_EXITING in >> drm_sched_entity_do_release >> >> -      if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL) >> +     if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL) > let me repeat, please don't use task->exit_code. And in fact this check is racy > > But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL, > or do something else, Can you explain where is the race and what is a possible alternative then ? > but I fail to understand what are you trying to do. Suppose > that the check above is correct in that it is true iff the task is exiting and > it was killed by SIGKILL. What about the "else" branch which does > > r = wait_event_killable(sched->job_scheduled, ...) > > ? > > Once again, fatal_signal_pending() (or even signal_pending()) is not well defined > after the exiting task passes exit_signals(). > > So wait_event_killable() can fail because fatal_signal_pending() is true; and this > can happen even if it was not killed. > > Or it can block and SIGKILL won't be able to wake it up. > >> If SIGINT was sent then it's SIGINT, > Yes, but see above. in this case fatal_signal_pending() will be likely true so > wait_event_killable() will fail unless condition is already true. My bad, I didn't show the full intended fix, it was just a snippet to address the differentiation between exiting do to SIGKILL and any other exit, I also intended to change wait_event_killable to wait_event_timeout. Andrey > > Oleg. >