Christian König <ckoenig.leichtzumerken at gmail.com> writes:

> Hi Eric,
>
> sorry for the late response, I was on vacation last week.
>
> On 26.04.2018 at 02:01, Eric W. Biederman wrote:
>> Andrey Grodzovsky <Andrey.Grodzovsky at amd.com> writes:
>>
>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to
>>>>> be able to exit immediately and not wait for GPU job completion when
>>>>> the reason for reaching this code is a KILL signal to the user process
>>>>> who opened the device file.
>>>> Can you hook the f_op->flush method?
>
> THANKS! That sounds like a really good idea to me and we haven't
> investigated in that direction yet.

For the backwards compatibility concerns you cite below, the flush method
seems a much better place to introduce the wait. You will at least really be
in a process context there. You might still be in exit, but at least you
will legitimately be in a process.

>>> But this one is called for each task releasing a reference to the file,
>>> so I am not sure I see how this solves the problem.
>> The big question is why do you need to wait during the final close of a
>> file?
>
> As always, it's for historical reasons. Initially user space pushed
> commands directly to a hardware queue, and when a process finished we
> didn't need to wait for anything.
>
> Then the GPU scheduler was introduced, which delayed pushing the jobs to
> the hardware queue to a later point in time.
>
> This wait was then added to maintain backward compatibility and not break
> user space (but see below).

That makes sense.

>> The wait can be terminated, so the wait does not appear to be simply a
>> matter of correctness.
>
> Well, when the process is killed we don't care about correctness any more,
> we just want to get rid of it as quickly as possible (OOM situations
> etc...).
>
> But it is perfectly possible that a process submits some render commands
> and then calls exit() or terminates because of a SIGTERM, SIGINT etc. In
> this case we need to wait here to make sure that all rendering is pushed
> to the hardware, because the scheduler might need resources/settings from
> the file descriptor.
>
> For example, if you just removed that wait you could close Firefox and get
> garbage on the screen for a millisecond because the remaining rendering
> commands were not executed.
>
> So what we essentially need is to distinguish between a SIGKILL (which
> means stop processing as soon as possible) and any other reason, because
> then we don't want to annoy the user with garbage on the screen (even if
> it's just for a few milliseconds).

I see a couple of issues.

- Running the code in release rather than in flush.

  Using flush will catch every close, so it should be more backwards
  compatible. f_op->flush always runs in process context, so looking at
  current makes sense.

- Distinguishing between death by SIGKILL and other process exit deaths.

  In f_op->flush the code can test
  "((tsk->flags & PF_EXITING) && (tsk->exit_code == SIGKILL))"
  to see if it was SIGKILL that terminated the process.

- Dealing with stuck queues (where this patchset came in).

  For stuck queues you are going to need a timeout instead of the current
  indefinite wait after PF_EXITING is set. From what you have described, a
  few milliseconds should be enough. If PF_EXITING is not set, you can
  still just make the wait killable and skip the timeout, if that gives a
  better backwards compatible user experience. A rough sketch of the whole
  thing follows below.
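Untested, and the names below (drm_sched_flush, drm_sched_file_priv, the
job_done_wq/jobs_pending fields, the 100ms bound) are made up for
illustration rather than taken from the actual amdgpu/scheduler code, but
roughly I am imagining something like:

	#include <linux/fs.h>
	#include <linux/sched.h>
	#include <linux/signal.h>
	#include <linux/wait.h>
	#include <linux/jiffies.h>
	#include <linux/atomic.h>

	/* Hypothetical per-file state; stands in for whatever the driver
	 * already tracks for outstanding scheduler jobs on this fd. */
	struct drm_sched_file_priv {
		wait_queue_head_t job_done_wq;	/* woken on job completion */
		atomic_t jobs_pending;		/* outstanding jobs */
	};

	static int drm_sched_flush(struct file *f, fl_owner_t id)
	{
		struct drm_sched_file_priv *priv = f->private_data;

		if (current->flags & PF_EXITING) {
			/* Killed outright: stop processing as soon as
			 * possible, don't wait for anything. */
			if (current->exit_code == SIGKILL)
				return 0;

			/* Normal exit (exit(), SIGTERM, SIGINT, ...): wait
			 * for the queued jobs, but bound the wait so a
			 * stuck queue cannot hang the dying process. */
			wait_event_timeout(priv->job_done_wq,
					   !atomic_read(&priv->jobs_pending),
					   msecs_to_jiffies(100));
		} else {
			/* Plain close(): keep the old indefinite wait, but
			 * let SIGKILL interrupt it. */
			wait_event_killable(priv->job_done_wq,
					    !atomic_read(&priv->jobs_pending));
		}
		return 0;
	}

Wiring it up would then just be a matter of setting
".flush = drm_sched_flush" in the driver's file_operations and leaving the
release callback free of any waits.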
What can't be done is to try to catch SIGKILL after a process has called
do_exit. A dead process is a dead process.

Eric