On 09.10.2017 13:27, Nicolai Hähnle wrote:
> On 09.10.2017 13:12, Christian König wrote:
>>>
>>>> Nicolai, how hard would it be to handle ENODEV as failure for all
>>>> currently existing contexts?
>>>
>>> Impossible? "All currently existing contexts" is not a well-defined
>>> concept when multiple drivers co-exist in the same process.
>>
>> Ok, let me refine the question: I assume there are resources "shared"
>> between contexts like binary shader code for example which needs to
>> be reuploaded when VRAM is lost.
>>
>> How hard would it be to handle that correctly?
>
> Okay, that makes more sense :)
>
> With the current interface it's still pretty difficult, but if we
> could get a new per-device query ioctl which returns a "VRAM loss
> counter", it would be reasonably straight-forward.

The problem with the VRAM lost counter is that this isn't safe either.
E.g. you could have an application which uploads shaders, a GPU reset
happens and VRAM is lost, and then the application creates a new
context and makes a submission with broken shader binaries.

So I would still vote for a separate IOCTL to reset the VRAM lost state
which is called *before* user space starts to reupload
shaders/descriptors etc. This way you also catch the case when another
reset happens while you reupload things.

>
>>> And what would be the purpose of this? If it's to support VRAM loss,
>>> having a per-context VRAM loss counter would enable each context to
>>> signal ECANCELED separately.
>>
>> I thought of that on top of the -ENODEV handling.
>>
>> In other words when we see -ENODEV we call an IOCTL to let the kernel
>> know we noticed that something is wrong and then reinit all shared
>> resources in userspace.
>>
>> All existing context will still see -ECANCELED when we drop their
>> command submission, but new contexts would at least not cause a new
>> lockup immediately because their shader binaries are corrupted.
>
> I don't think we need -ENODEV for this. We just need -ECANCELED to be
> returned when a submission is rejected due to reset (hang or VRAM loss).
>
> Mesa would keep a shadow of the VRAM loss counter in pipe_screen and
> pipe_context, and query the kernel's counter when it encounters
> -ECANCELED. Each context would then know to drop the CS it's built up
> so far and restart based on comparing the VRAM loss counter of
> pipe_screen to that of pipe_context, and similarly we could keep a
> copy of the VRAM loss counter for important buffer objects like shader
> binaries, descriptors, etc.
>
> This seems more robust to me than relying only on an ENODEV. We'd most
> likely keep some kind of VRAM loss counter in Mesa *anyway* (we don't
> maintain a list of all shaders, for example, and we can't nuke
> important per-context across threads), and synthesizing such a counter
> from ENODEVs is not particularly robust (what if multiple ENODEVs
> occur for the same loss event?).
>
> BTW, I still don't like ENODEV. It seems more like the kind of error
> code you'd return with hot-pluggable GPUs where the device can
> physically disappear...

Yeah, ECANCELED sounds like a better alternative. But I think we should
still somehow note the fatality of losing VRAM to userspace.

How about ENODATA or EBADFD?

Regards,
Christian.
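As a rough illustration of the counter-shadowing scheme Nicolai describes
above, the userspace side could look something like the sketch below. The
structs stand in for pipe_screen/pipe_context, and query_vram_lost_counter(),
reupload_shared_resources() and mark_context_lost() are hypothetical helpers
for whatever query and re-upload paths would actually be added; none of this
is existing amdgpu or Mesa API. Whether the query alone is enough, or whether
an explicit "acknowledge" IOCTL is needed as argued above, is exactly the
open question.

    #include <errno.h>
    #include <stdint.h>

    /* Stand-ins for pipe_screen / pipe_context with a shadowed counter. */
    struct screen_state {
            uint64_t vram_lost_counter;    /* last device value we acted on */
    };

    struct context_state {
            struct screen_state *screen;
            uint64_t vram_lost_counter;    /* value this context last saw */
    };

    /* Hypothetical helpers, not existing API. */
    uint64_t query_vram_lost_counter(struct screen_state *s);
    void reupload_shared_resources(struct screen_state *s); /* shaders, descriptors, ... */
    void mark_context_lost(struct context_state *ctx);      /* drop the CS, report reset */

    static void handle_cs_error(struct context_state *ctx, int error)
    {
            uint64_t counter;

            if (error != -ECANCELED)
                    return;

            /* A submission was rejected: ask the kernel how often VRAM was lost. */
            counter = query_vram_lost_counter(ctx->screen);

            /* Per-screen shadow: re-upload shared resources (shader binaries,
             * descriptors, ...) only once per loss event. */
            if (counter != ctx->screen->vram_lost_counter) {
                    ctx->screen->vram_lost_counter = counter;
                    reupload_shared_resources(ctx->screen);
            }

            /* Per-context shadow: this context predates the loss, so drop the
             * CS built up so far and report the context as lost. */
            if (ctx->vram_lost_counter != ctx->screen->vram_lost_counter) {
                    ctx->vram_lost_counter = ctx->screen->vram_lost_counter;
                    mark_context_lost(ctx);
            }
    }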
"All currently existing contexts" is not a well-defined >>> concept when multiple drivers co-exist in the same process. >>> >>> And what would be the purpose of this? If it's to support VRAM loss, >>> having a per-context VRAM loss counter would enable each context to >>> signal ECANCELED separately. >>> >>> Cheers, >>> Nicolai >>> >>> >>>> >>>> Monk, would it be ok with you when we return ENODEV only for >>>> existing context when VRAM is lost and/or we have a strict mode GPU >>>> reset? E.g. newly created contexts would continue work as they should. >>>> >>>> Regards, >>>> Christian. >>>> >>>> Am 09.10.2017 um 12:49 schrieb Nicolai Hähnle: >>>>> Hi Monk, >>>>> >>>>> Yes, you're right, we're only using ECANCELED internally. But as a >>>>> consequence, Mesa would already handle a kernel error of ECANCELED >>>>> on context loss correctly :) >>>>> >>>>> Cheers, >>>>> Nicolai >>>>> >>>>> On 09.10.2017 12:35, Liu, Monk wrote: >>>>>> Hi Christian >>>>>> >>>>>> You reject some of my patches that returns -ENODEV, with the >>>>>> cause that MESA doesn't do the handling on -ENODEV >>>>>> >>>>>> But if Nicolai can confirm that MESA do have a handling on >>>>>> -ECANCELED, then we need to overall align our error code, on >>>>>> detail below IOCTL can return error code: >>>>>> >>>>>> Amdgpu_cs_ioctl >>>>>> Amdgpu_cs_wait_ioctl >>>>>> Amdgpu_cs_wait_fences_ioctl >>>>>> Amdgpu_info_ioctl >>>>>> >>>>>> >>>>>> My patches is: >>>>>> return -ENODEV on cs_ioctl if the context is detected guilty, >>>>>> also return -ENODEV on cs_wait|cs_wait_fences if the fence is >>>>>> signaled but with error -ETIME, >>>>>> also return -ENODEV on info_ioctl so UMD can query if gpu reset >>>>>> happened after the process created (because for strict mode we >>>>>> block process instead of context) >>>>>> >>>>>> >>>>>> according to Nicolai: >>>>>> >>>>>> amdgpu_cs_ioctl *can* return -ECANCELED, but to be frankly >>>>>> speaking, kernel part doesn't have any place with "-ECANCELED" so >>>>>> this solution on MESA side doesn't align with *current* amdgpu >>>>>> driver, >>>>>> which only return 0 on success or -EINVALID on other error but >>>>>> definitely no "-ECANCELED" error code, >>>>>> >>>>>> so if we talking about community rules we shouldn't let MESA >>>>>> handle -ECANCELED , we should have a unified error code >>>>>> >>>>>> + Marek >>>>>> >>>>>> BR Monk >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: Haehnle, Nicolai >>>>>> Sent: 2017å¹´10æ??9æ?¥ 18:14 >>>>>> To: Koenig, Christian <Christian.Koenig at amd.com>; Liu, Monk >>>>>> <Monk.Liu at amd.com>; Nicolai Hähnle <nhaehnle at gmail.com>; >>>>>> amd-gfx at lists.freedesktop.org >>>>>> Subject: Re: [PATCH 5/5] drm/amd/sched: signal and free remaining >>>>>> fences in amd_sched_entity_fini >>>>>> >>>>>> On 09.10.2017 10:02, Christian König wrote: >>>>>>>> For gpu reset patches (already submitted to pub) I would make >>>>>>>> kernel >>>>>>>> return -ENODEV if the waiting fence (in cs_wait or wait_fences >>>>>>>> IOCTL) >>>>>>>> founded as error, that way UMD would run into robust extension >>>>>>>> path >>>>>>>> and considering the GPU hang occurred, >>>>>>> Well that is only closed source behavior which is completely >>>>>>> irrelevant for upstream development. >>>>>>> >>>>>>> As far as I know we haven't pushed the change to return -ENODEV >>>>>>> upstream. >>>>>> >>>>>> FWIW, radeonsi currently expects -ECANCELED on CS submissions and >>>>>> treats those as context lost. Perhaps we could use the same error >>>>>> on fences? 
>>>>>> That makes more sense to me than -ENODEV.
>>>>>>
>>>>>> Cheers,
>>>>>> Nicolai
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>> On 09.10.2017 08:42, Liu, Monk wrote:
>>>>>>>> Christian
>>>>>>>>
>>>>>>>>> It would be really nice to have an error code set on
>>>>>>>>> s_fence->finished before it is signaled, use
>>>>>>>>> dma_fence_set_error() for this.
>>>>>>>> For gpu reset patches (already submitted to pub) I would make
>>>>>>>> kernel return -ENODEV if the waiting fence (in cs_wait or
>>>>>>>> wait_fences IOCTL) founded as error, that way UMD would run into
>>>>>>>> robust extension path and considering the GPU hang occurred,
>>>>>>>>
>>>>>>>> Don't know if this is expected for the case of normal process
>>>>>>>> being killed or crashed like Nicolai hit ... since there is no
>>>>>>>> gpu hang hit
>>>>>>>>
>>>>>>>>
>>>>>>>> BR Monk
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On
>>>>>>>> Behalf Of Christian König
>>>>>>>> Sent: September 28, 2017 23:01
>>>>>>>> To: Nicolai Hähnle <nhaehnle at gmail.com>;
>>>>>>>> amd-gfx at lists.freedesktop.org
>>>>>>>> Cc: Haehnle, Nicolai <Nicolai.Haehnle at amd.com>
>>>>>>>> Subject: Re: [PATCH 5/5] drm/amd/sched: signal and free remaining
>>>>>>>> fences in amd_sched_entity_fini
>>>>>>>>
>>>>>>>> On 28.09.2017 16:55, Nicolai Hähnle wrote:
>>>>>>>>> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>>>>>>>>
>>>>>>>>> Highly concurrent Piglit runs can trigger a race condition where
>>>>>>>>> a pending SDMA job on a buffer object is never executed because
>>>>>>>>> the corresponding process is killed (perhaps due to a crash).
>>>>>>>>> Since the job's fences were never signaled, the buffer object was
>>>>>>>>> effectively leaked. Worse, the buffer was stuck wherever it
>>>>>>>>> happened to be at the time, possibly in VRAM.
>>>>>>>>>
>>>>>>>>> The symptom was user space processes stuck in interruptible waits
>>>>>>>>> with kernel stacks like:
>>>>>>>>>
>>>>>>>>>   [<ffffffffbc5e6722>] dma_fence_default_wait+0x112/0x250
>>>>>>>>>   [<ffffffffbc5e6399>] dma_fence_wait_timeout+0x39/0xf0
>>>>>>>>>   [<ffffffffbc5e82d2>] reservation_object_wait_timeout_rcu+0x1c2/0x300
>>>>>>>>>   [<ffffffffc03ce56f>] ttm_bo_cleanup_refs_and_unlock+0xff/0x1a0 [ttm]
>>>>>>>>>   [<ffffffffc03cf1ea>] ttm_mem_evict_first+0xba/0x1a0 [ttm]
>>>>>>>>>   [<ffffffffc03cf611>] ttm_bo_mem_space+0x341/0x4c0 [ttm]
>>>>>>>>>   [<ffffffffc03cfc54>] ttm_bo_validate+0xd4/0x150 [ttm]
>>>>>>>>>   [<ffffffffc03cffbd>] ttm_bo_init_reserved+0x2ed/0x420 [ttm]
>>>>>>>>>   [<ffffffffc042f523>] amdgpu_bo_create_restricted+0x1f3/0x470 [amdgpu]
>>>>>>>>>   [<ffffffffc042f9fa>] amdgpu_bo_create+0xda/0x220 [amdgpu]
>>>>>>>>>   [<ffffffffc04349ea>] amdgpu_gem_object_create+0xaa/0x140 [amdgpu]
>>>>>>>>>   [<ffffffffc0434f97>] amdgpu_gem_create_ioctl+0x97/0x120 [amdgpu]
>>>>>>>>>   [<ffffffffc037ddba>] drm_ioctl+0x1fa/0x480 [drm]
>>>>>>>>>   [<ffffffffc041904f>] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
>>>>>>>>>   [<ffffffffbc23db33>] do_vfs_ioctl+0xa3/0x5f0
>>>>>>>>>   [<ffffffffbc23e0f9>] SyS_ioctl+0x79/0x90
>>>>>>>>>   [<ffffffffbc864ffb>] entry_SYSCALL_64_fastpath+0x1e/0xad
>>>>>>>>>   [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>>>
>>>>>>>>> Signed-off-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>>>>>>>> Acked-by: Christian König <christian.koenig at amd.com>
>>>>>>>>> ---
>>>>>>>>>  drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 7 ++++++-
>>>>>>>>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>>>>>>> index 54eb77cffd9b..32a99e980d78 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>>>>>>> @@ -220,22 +220,27 @@ void amd_sched_entity_fini(struct amd_gpu_scheduler *sched,
>>>>>>>>>                  amd_sched_entity_is_idle(entity));
>>>>>>>>>      amd_sched_rq_remove_entity(rq, entity);
>>>>>>>>>      if (r) {
>>>>>>>>>          struct amd_sched_job *job;
>>>>>>>>>          /* Park the kernel for a moment to make sure it isn't processing
>>>>>>>>>           * our enity.
>>>>>>>>>           */
>>>>>>>>>          kthread_park(sched->thread);
>>>>>>>>>          kthread_unpark(sched->thread);
>>>>>>>>> -        while (kfifo_out(&entity->job_queue, &job, sizeof(job)))
>>>>>>>>> +        while (kfifo_out(&entity->job_queue, &job, sizeof(job))) {
>>>>>>>>> +            struct amd_sched_fence *s_fence = job->s_fence;
>>>>>>>>> +            amd_sched_fence_scheduled(s_fence);
>>>>>>>> It would be really nice to have an error code set on
>>>>>>>> s_fence->finished before it is signaled, use
>>>>>>>> dma_fence_set_error() for this.
>>>>>>>>
>>>>>>>> Additional to that it would be nice to note in the subject line
>>>>>>>> that this is a rather important bug fix.
>>>>>>>>
>>>>>>>> With that fixed the whole series is Reviewed-by: Christian König
>>>>>>>> <christian.koenig at amd.com>.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> +            amd_sched_fence_finished(s_fence);
>>>>>>>>> +            dma_fence_put(&s_fence->finished);
>>>>>>>>>              sched->ops->free_job(job);
>>>>>>>>> +        }
>>>>>>>>>      }
>>>>>>>>>      kfifo_free(&entity->job_queue);
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  static void amd_sched_entity_wakeup(struct dma_fence *f, struct dma_fence_cb *cb)
>>>>>>>>>  {
>>>>>>>>>      struct amd_sched_entity *entity =
>>>>>>>>>          container_of(cb, struct amd_sched_entity, cb);
>>>>>>>>>      entity->dependency = NULL;
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> amd-gfx mailing list
>>>>>>>> amd-gfx at lists.freedesktop.org
>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>> _______________________________________________
>>>>>>>> amd-gfx mailing list
>>>>>>>> amd-gfx at lists.freedesktop.org
>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
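For reference, the drained-job loop from the patch above with the suggested
dma_fence_set_error() call added would look roughly like this. -ECANCELED is
used purely as a placeholder, since which error code the finished fence
should carry is exactly what is still being discussed in this thread:

    while (kfifo_out(&entity->job_queue, &job, sizeof(job))) {
            struct amd_sched_fence *s_fence = job->s_fence;

            amd_sched_fence_scheduled(s_fence);
            /* Record why the job never ran before the fence is signaled. */
            dma_fence_set_error(&s_fence->finished, -ECANCELED);
            amd_sched_fence_finished(s_fence);
            dma_fence_put(&s_fence->finished);
            sched->ops->free_job(job);
    }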