On 05/01/2018 10:35 AM, Oleg Nesterov wrote: > On 04/30, Andrey Grodzovsky wrote: >> On 04/30/2018 12:00 PM, Oleg Nesterov wrote: >>> On 04/30, Andrey Grodzovsky wrote: >>>> What about changing PF_SIGNALED to PF_EXITING in >>>> drm_sched_entity_do_release >>>> >>>> -      if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL) >>>> +     if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL) >>> let me repeat, please don't use task->exit_code. And in fact this check is racy >>> >>> But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL, >>> or do something else, >> Can you explain where is the race and what is a possible alternative then ? > Oh. I mentioned this race automatically, because I am pedant ;) Let me repeat > that this doesn't really matter, and let me remind that the caller of fop->release > can be completely unrelated process, say $cat /proc/pid/fdinfo. And in any case > ->exit_code should not be used outside of ptrace/exit paths. > > OK, the race. Consider a process P with a main thread M and a sub-thread T. > > T does pthread_exit(), enters do_exit() and gets a preemption before exit_files(). > > The process is killed by SIGKILL. M calls do_group_exit(), do_exit() and passes > exit_files(). However, it doesn't call close_files() because T has another reference. > > T resumes, calls close_files(), fput(), etc, and then exit_task_work(), so > it can finally call ->release() with current->exit_code == 0 desptite the fact > the process was killed. > > Again, again, this doesn't matter. We can distinguish killed-or-not, by SIGKILL- > or-not. But I still do not think we actually need this. At least in ->release() > paths, ->flush() may differ. Hi Oleg, reworked the code to use .flush hook and eliminated wait_event_killable. So in case of .flush is this OK to look at task->exit_code == SIGKILL to distinguish exit by signal ? Andrey > > Oleg. >