On 18.01.19 at 18:34, Grodzovsky, Andrey wrote:
>
> On 01/18/2019 12:10 PM, Koenig, Christian wrote:
>> On 18.01.19 at 16:21, Grodzovsky, Andrey wrote:
>>> On 01/18/2019 04:25 AM, Koenig, Christian wrote:
>>>> [SNIP]
>>>>>>>> Re-arming the timeout should probably have a much reduced value
>>>>>>>> when the job hasn't changed. E.g. something like a few ms.
>>>>> Now I got thinking about a non-hung job in progress (job A). Let's
>>>>> say it's a long job that just started executing, but due to a
>>>>> timeout of another job (job B) on another (or this) scheduler its
>>>>> parent cb got disconnected and we disarmed the tdr timer for the
>>>>> job's scheduler. Meanwhile the timed-out job managed to complete
>>>>> before the HW reset check, so we skip the HW reset, attach the cb
>>>>> back and re-arm job A's tdr timer with a future value of only a
>>>>> few ms. Aren't we going to get a false tdr triggered on job A now,
>>>>> because we didn't give it enough time to run and complete? I would
>>>>> rather err on the side of a longer timeout than get a false TDR.
>>>>> Optimally we would have a per-job timer and re-arm it to exactly
>>>>> the remainder of its timeout value, but we gave up on per-job tdr
>>>>> work long ago.
>>>> Well, we only re-arm the timeout with a shorter period if it
>>>> already triggered once. If we just suspend the timeout then we
>>>> should still use the longer period.
>>> Can you explain this a bit more? I don't get it.
>> See drm_sched_job_timedout(): we re-arm the timeout at the end of
>> the procedure.
>>
>> We should change that and re-arm the timer with a much lower timeout
>> if the job is still not finished.
>>
>> Christian.
> I still don't see how this can fix the problem of a long job in
> progress triggering a false tdr if no HW reset was done, but maybe I
> am missing other pieces you have in mind. I will finish the patch and
> send it, and then we can be more specific based on the code.

Ok, sounds good. We should probably discuss the details less and
prototype a bit more.

Might be that I'm missing something here as well, so it's probably
good to have some code to talk about things more directly.

Christian.
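[To make the proposal concrete, here is a minimal sketch of the
shorter re-arm, loosely modeled on the shape of
drm_sched_job_timedout() at the time. DRM_SCHED_RETRY_TIMEOUT, the
unchanged-job check and the omission of job_list_lock locking are
assumptions for illustration, not actual upstream code:]

#include <linux/workqueue.h>
#include <drm/gpu_scheduler.h>

/* Hypothetical value: short grace period for a job that already got
 * its full timeout once. */
#define DRM_SCHED_RETRY_TIMEOUT	msecs_to_jiffies(10)

static void drm_sched_job_timedout(struct work_struct *work)
{
	struct drm_gpu_scheduler *sched =
		container_of(work, struct drm_gpu_scheduler, work_tdr.work);
	struct drm_sched_job *job, *again;

	/* Locking and the MAX_SCHEDULE_TIMEOUT checks of the real code
	 * are omitted for brevity. */
	job = list_first_entry_or_null(&sched->ring_mirror_list,
				       struct drm_sched_job, node);
	if (job)
		job->sched->ops->timedout_job(job);

	/* If the same job is still at the head (no HW reset happened,
	 * nothing was resubmitted), re-arm with only the short grace
	 * period instead of the full sched->timeout. */
	again = list_first_entry_or_null(&sched->ring_mirror_list,
					 struct drm_sched_job, node);
	if (again && again == job)
		schedule_delayed_work(&sched->work_tdr,
				      DRM_SCHED_RETRY_TIMEOUT);
	else
		schedule_delayed_work(&sched->work_tdr, sched->timeout);
}

[A plain suspend/resume of the timeout would keep the full
sched->timeout; only a timeout that actually fired once would get the
short re-arm, which matches the "if it already triggered once" rule
above.]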
>
> Andrey
>
>>> Andrey
>>>
>>>>> In general, the more I think about it (correct me if I am wrong),
>>>>> the less sure I am how useful the optimization feature is. If a
>>>>> job's timeout did trigger, what are the chances that the little
>>>>> extra time we give it between the beginning of the tdr function
>>>>> and the moment we start the actual HW reset is exactly what it
>>>>> needed to complete? Also, this is still not waterproof, as the job
>>>>> might complete and signal its HW fence exactly after we checked
>>>>> for completion but before starting the HW reset code.
>>>> I don't see this as an optimization, but rather as mandatory for
>>>> correct operation.
>>>>
>>>> See, without this we can run into issues because we execute jobs
>>>> multiple times. That can still happen with this clean handling, but
>>>> it is much more unlikely.
>>>>
>>>> Christian.
>>>>
>>>>> Andrey
>>>>>
>>>>>>> By unchanged you mean when we didn't resubmit the job because of
>>>>>>> the optimized non-HW-reset path, right?
>>>>>> Correct, yes.
>>>>>>
>>>>>>>>> About flushing tdr jobs in progress from the .free_job cb: it
>>>>>>>>> looks like drm_sched_job_finish->cancel_delayed_work_sync is
>>>>>>>>> not enough; we still need to take care of flushing all
>>>>>>>>> sched->work_tdr for a device, and for all devices in the hive
>>>>>>>>> for XGMI. What do you think?
>>>>>>>> Why should that be necessary? We only wait for the delayed work
>>>>>>>> to make sure that the job is not destroyed while dealing with
>>>>>>>> it.
>>>>>>>>
>>>>>>>> Christian.
>>>>>>> But we might not be waiting for the correct sched->work_tdr: we
>>>>>>> run the reset routine for all schedulers in a device, accessing
>>>>>>> their jobs too, and not only for the scheduler to which the job
>>>>>>> belongs. With XGMI, not only that, we reset all the devices in
>>>>>>> the hive.
>>>>>> That is harmless; you only need to wait for the work_tdr of the
>>>>>> current scheduler, not for all of them.
>>>>>>
>>>>>>> I was thinking: the amdgpu driver is not even interested in
>>>>>>> allowing multiple sched->work_tdr to execute together, since we
>>>>>>> have to serialize all of them anyway with the trylock mutex
>>>>>>> (even without XGMI). v3d in v3d_job_timedout also seems to reset
>>>>>>> all of its schedulers from the tdr work. Would it make sense to
>>>>>>> provide the sched->work_tdr as an init parameter to the
>>>>>>> scheduler (the same one for all schedulers), so we can enforce
>>>>>>> serialization by disallowing more than one tdr work to execute
>>>>>>> at the same time? Other drivers interested in running them in
>>>>>>> parallel can provide a unique sched->work_tdr per scheduler.
>>>>>>> This does imply that drm_sched_job_timedout has to be removed
>>>>>>> and delegated to the specific driver implementation, as probably
>>>>>>> does other code dealing with sched->work_tdr... Maybe even move
>>>>>>> tdr handling to the driver altogether?
>>>>>> Yeah, I was thinking something similar. The problem with this
>>>>>> approach is that a delayed work item can have only one delay, but
>>>>>> for multiple engines we need multiple delays.
>>>>>>
>>>>>> What we could do is to make it a timer instead and raise the work
>>>>>> item from the device-specific callback.
>>>>>>
>>>>>> But that doesn't really save us the stop-all-schedulers trouble,
>>>>>> so it doesn't buy us much in the end, if I see this correctly.
>>>>>>
>>>>>> Christian.
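[For reference, a minimal sketch of the "timer instead of delayed
work" idea in the last quoted paragraphs: one timer per scheduler
supplies the per-engine delay, while every timeout funnels into a
single shared work item, so at most one tdr handler runs at a time.
All type and member names (engine_tdr, shared_tdr, engine_timeout_fn,
tdr_work_fn) are hypothetical, not amdgpu or drm_sched code:]

#include <linux/timer.h>
#include <linux/workqueue.h>

/* Hypothetical types for illustration only. */
struct shared_tdr {
	struct work_struct work;	/* one work item per device/hive */
};

struct engine_tdr {
	struct timer_list timer;	/* one delay per engine */
	struct shared_tdr *shared;
};

static void engine_timeout_fn(struct timer_list *t)
{
	struct engine_tdr *e = from_timer(e, t, timer);

	/*
	 * Funnel every engine timeout into the single shared work item.
	 * The workqueue guarantees a work item is non-reentrant, so tdr
	 * handling is serialized across all schedulers that share it.
	 */
	queue_work(system_wq, &e->shared->work);
}

static void tdr_work_fn(struct work_struct *work)
{
	/* The device-specific reset routine would run here, e.g. the
	 * stop-all-schedulers + HW reset + resubmit sequence. */
}

[As noted above, this does not remove the stop-all-schedulers work; it
only replaces the per-scheduler delayed work with a timer plus one
shared, serialized work item.]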