Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

On 18.05.2018 at 17:02, Andrey Grodzovsky wrote:


On 05/18/2018 10:50 AM, Christian König wrote:
On 18.05.2018 at 16:44, Michel Dänzer wrote:
On 2018-05-18 11:42 AM, Christian König wrote:
Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the flush
callback being called from multiple processes is an issue, maybe the
flush callback isn't appropriate after all.
Userspace could also grab a reference just by opening /proc/$pid/fd/*.

The idea is just that when any process which used the fd is killed by a signal we drop the remaining jobs from being submitted to the hardware.
This must only affect jobs submitted by the killed process, not those
submitted by other processes.

Yeah, that's exactly the plan here.

I don't see how that is going to happen:
.flush is called for any terminating process, regardless of whether it submitted jobs or just accidentally (or not) has the device file FD in its private file table. So we are going to have a problem with that requirement. If a process is being killed and .flush is executed, I don't have any way to know which amdgpu_ctx to choose in order to terminate its pending jobs.
The only info I have from the .flush caller is the process id.
As it is now, in amdgpu_ctx_mgr_entity_fini and amdgpu_ctx_mgr_entity_cleanup we iterate over all the contexts in the context manager list and terminate them all, which indeed sounds wrong to me. I can save the pid of the context creator in the context structure so I can match on it during the .flush call, but if someone creates the context and passes the context id to another process for the actual job submission, this approach won't work either.

Am I missing something here?

Your analysis is correct, it's just that I think that this case should not happen.

What can happen is that the fd is passed accidentally to child processes and those child processes are then killed, but passing the fd to child processes is a bug in the first place.
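For illustration only, here is a small userspace sketch of the close-on-exec flag that keeps an fd from leaking into exec'ed children. It uses os.devnull as a stand-in for the device node, and the helper names open_cloexec and survives_exec are made up for this example. Note that O_CLOEXEC only takes effect on exec; a child that merely forks still shares the fd, and Python already opens fds non-inheritable by default, so the flag is spelled out here just to make the point visible.

```python
import os

def open_cloexec(path):
    """Open path read-only with the close-on-exec flag set explicitly."""
    return os.open(path, os.O_RDONLY | os.O_CLOEXEC)

def survives_exec(fd):
    """Return True if fd would stay open in a child after exec()."""
    return os.get_inheritable(fd)

if __name__ == "__main__":
    fd = open_cloexec(os.devnull)       # stand-in for the device file
    print(survives_exec(fd))            # close-on-exec is set
    os.set_inheritable(fd, True)        # simulate an fd opened without O_CLOEXEC
    print(survives_exec(fd))            # now it would leak into exec'ed children
    os.close(fd)
```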

When somebody opens the fd on purpose and then kills the process, it breaks and he can keep the pieces. I mean, to open the fd you need to be privileged anyway.

What we could do to completely fix the issue:
1. Note for each submitted job which process (pid) submitted it.
2. During flush wait or kill only jobs of the current process.

But I think that this is overkill.
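To make the two steps above concrete, here is a rough userspace model, not kernel code. The Job and Entity classes and their fields are hypothetical and do not correspond to the real scheduler structures; the sketch only shows the bookkeeping: record the submitter pid per job, then have flush drop only that pid's pending jobs.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    job_id: int
    submitter_pid: int  # step 1: recorded when the job is submitted

@dataclass
class Entity:
    """Hypothetical stand-in for a queue of not-yet-scheduled jobs."""
    pending: list = field(default_factory=list)

    def submit(self, job_id, pid):
        self.pending.append(Job(job_id, pid))

    def flush(self, pid):
        """Step 2: drop only the pending jobs submitted by `pid`.

        Returns the dropped jobs; jobs from other pids stay queued.
        """
        dropped = [j for j in self.pending if j.submitter_pid == pid]
        self.pending = [j for j in self.pending if j.submitter_pid != pid]
        return dropped
```

With this bookkeeping, a flush triggered by a process that merely holds the fd but never submitted anything drops no jobs at all.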

Christian.


Andrey


For additional security we could save the pid of the job submitter, but since this should basically not happen in normal operation I would rather avoid that.

Christian.


_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel