Am 03.08.2018 um 20:41 schrieb Andrey
Grodzovsky:
On 08/06/2018 08:44 AM, Christian
König wrote:
Am 03.08.2018 um 16:54 schrieb
Andrey Grodzovsky:
[SNIP]
>
> Second of all, even after we removed the
entity from rq in
> drm_sched_entity_flush to terminate any
subsequent submissions
>
> to the queue the other thread doing push job
can just acquire again a
> run queue
>
> from drm_sched_entity_get_free_sched and
continue submission
Hi Christian
That is actually desired.
When another process is now using the entity to
submit jobs adding it
back to the rq is actually the right thing to do
cause the entity is
still in use.
Yes, no problem if it's another process. But what about
another thread from same process ? Is it a possible use
case that 2 threads from same process submit to same
entity concurrently ? If so and we specifically kill one,
the other will not stop event if we want him to because
current code makes him just require a rq for him self.
Well you can't kill a single thread of a process (you can
only interrupt it), a SIGKILL will always kill the whole
process.
Is the following scenario possible and acceptable ?
2 threads from same process working on same queue where thread
A currently in drm_sched_entity_flush->wait_event_timeout
(the process getting shut down because of SIGKILL sent by
user)
while thread B still inside drm_sched_entity_push_job before
'if (reschedule)'. 'A' stopped waiting because queue became
empty and then removes the entity queue from scheduler's run
queue while
B goes inside 'reschedule' because it evaluates to true
('first' is true and all the rest of the conditions), acquires
new rq, and later adds it back to scheduler (different one
maybe) and keeps submitting jobs as much as he likes and then
can be stack for up to 'timeout' time in his
drm_sched_entity_flush waiting for them.
I'm not 100% sure but I don't think that can happen.
See flushing the fd is done while dropping the fd, which happens
only after all threads of the process in question are killed.
Yea, this FDs handling is indeed a lot of gray area for me but as
far as I remember flushing is done per each thread when exits
(possibly due to a signal).
Now signals interception and processing (as a result of which
.flush will get called if SIGKILL received) is done in some points
amongst which is when returning from IOCTL.
So if first thread was at the very end of the CS ioctl when
SIGKILL was received while the other one at the beginning then I
think we might see something like the scenario above.
SIGKILL isn't processed as long as any thread of the application is
still inside the kernel. That's why we have wait_event_killable().
So I don't think that the scenario above is possible, but I'm really
not 100% sure either.
Christian.
Andrey
Otherwise the flushing wouldn't make to much sense. In other
words imagine an application where a thread does a write() on a
fd which is killed.
The idea of the flush is to preserve the data and that won't
work if that isn't correctly ordered.
My
understanding was that introduction of entity->last is to
force immediate termination job submissions by any thread from
the terminating process.
We could consider reordering that once more. Going to play out
all scenarios in my head over the weekend :)
Christian.
Andrey
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
|
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel