> On April 9, 2020, at 22:59, Koenig, Christian <Christian.Koenig@xxxxxxx> wrote:
>
>> Why do we break out of the loop when there are pending BOs to be released?
>
> We do this anyway if we can't acquire the necessary locks. Freeing already deleted BOs is just very lazy background work.

That is true. Eviction will reclaim the BO resources too.

>> So this patch doesn't break anything, I think.
>
> Oh, the patch will certainly work. I'm just not sure if it's the ideal behavior.
>
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>>
>> This is another example of the usage of cond_resched().
>
> Yes, and that is also a good example of what I mean here:
>
>> if (!mutex_trylock(&slab_mutex))
>> 	/* Give up. Setup the next iteration. */
>> 	goto out;
>
> If the function can't acquire the lock immediately, it gives up and waits for the next iteration.
>
> I think it would be better to do this in TTM as well if we spend too much time cleaning up old BOs.

Fair enough.

> On the other hand, you are right that cond_resched() has the advantage that we could spend more time on cleaning up old BOs if there is nothing else for the CPU to do.
>
> Regards,
> Christian.
>
> On 09.04.20 at 16:24, Pan, Xinhui wrote:
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>>
>> This is another example of the usage of cond_resched().
>>
>> From: Pan, Xinhui <Xinhui.Pan@xxxxxxx>
>> Sent: Thursday, April 9, 2020 10:11:08 PM
>> To: Lucas Stach <l.stach@xxxxxxxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>
>> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> I think it doesn't matter if the work item schedules out. Even if we don't schedule out, the workqueue itself will schedule out later.
>> So this patch doesn't break anything, I think.
>>
>> From: Pan, Xinhui <Xinhui.Pan@xxxxxxx>
>> Sent: Thursday, April 9, 2020 10:07:09 PM
>> To: Lucas Stach <l.stach@xxxxxxxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>
>> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> Why do we break out of the loop when there are pending BOs to be released?
>>
>> And I just checked process_one_work(). Right after the work item callback is called, the workqueue itself will call cond_resched(). So I think
>>
>> From: Koenig, Christian <Christian.Koenig@xxxxxxx>
>> Sent: Thursday, April 9, 2020 9:38:24 PM
>> To: Lucas Stach <l.stach@xxxxxxxxxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> On 09.04.20 at 15:25, Lucas Stach wrote:
>> > On Thursday, 09.04.2020, at 14:35 +0200, Christian König wrote:
>> >> On 09.04.20 at 03:31, xinhui pan wrote:
>> >>> The delayed delete list is per device, which might be very huge. And in
>> >>> a heavy workload test, the list might never be empty. That may trigger
>> >>> RCU stall warnings or soft lockups in non-preemptible kernels.
>> >>> Let's schedule out if possible in that case.
>> >> Mhm, I'm not sure if that is actually allowed.
>> >> This is called from a work item, and those are not really supposed to be
>> >> scheduled away.
>> > Huh? Work items can schedule out just fine, otherwise they would be
>> > horribly broken when it comes to sleeping locks.
>>
>> Let me refine the sentence: work items are not really supposed to be
>> scheduled away on purpose. E.g. you shouldn't call schedule() or
>> cond_resched() like in the case here.
>>
>> Getting scheduled away because we wait for a lock is of course perfectly
>> fine.
>>
>> > The workqueue code
>> > even has measures to keep the workqueues at the expected concurrency
>> > level by starting other work items when one of them goes to sleep.
>>
>> Yeah, and that's exactly what I would say we should avoid here :)
>>
>> In other words, work items can be scheduled away, but they shouldn't be
>> unless really necessary (e.g. waiting for a lock).
>>
>> Otherwise, as you said, new threads for work item processing are started
>> up, and I don't think we want that.
>>
>> Just returning from the work item and waiting for the next cycle is most
>> likely the better option.
>>
>> Regards,
>> Christian.
>>
>> > Regards,
>> > Lucas
>> >
>> >> Maybe rather change the while into
>> >> while (!list_empty(&bdev->ddestroy) && !should_reschedule(0)).
>> >>
>> >> Christian.
>> >>
>> >>> Signed-off-by: xinhui pan <xinhui.pan@xxxxxxx>
>> >>> ---
>> >>>   drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>> >>>   1 file changed, 1 insertion(+)
>> >>>
>> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> >>> index 9e07c3f75156..b8d853cab33b 100644
>> >>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> >>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> >>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>> >>>   	}
>> >>>
>> >>>   	ttm_bo_put(bo);
>> >>> +	cond_resched();
>> >>>   	spin_lock(&glob->lru_lock);
>> >>>   }
>> >>>   list_splice_tail(&removed, &bdev->ddestroy);
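
For reference, the give-up-and-retry variant discussed above (mirroring the mutex_trylock() bail-out in mm/slab.c's cache_reap()) could look roughly like the sketch below. This is only an illustration, not a tested patch: TTM_CLEANUP_BUDGET and its value are invented for the example, should_reschedule() from the suggestion above is not an existing kernel API (need_resched() stands in for it), ttm_bo_glob is assumed to be the global LRU state the patched function already uses through its local glob pointer, and the actual BO release steps are elided.

/*
 * Illustrative sketch only -- not the in-tree TTM code.  Instead of
 * calling cond_resched() inside the loop, bail out once a
 * (hypothetical) time budget is spent or a reschedule is due, and let
 * the next delayed-work cycle continue the cleanup.
 */
#include <linux/jiffies.h>
#include <linux/list.h>
#include <linux/sched.h>
#include <linux/spinlock.h>
#include <drm/ttm/ttm_bo_driver.h>

/* Hypothetical budget for one worker run; the value is made up. */
#define TTM_CLEANUP_BUDGET	msecs_to_jiffies(10)

static bool ttm_bo_delayed_delete_budgeted(struct ttm_bo_device *bdev,
					   bool remove_all)
{
	struct ttm_bo_global *glob = &ttm_bo_glob;
	unsigned long deadline = jiffies + TTM_CLEANUP_BUDGET;
	bool empty;

	spin_lock(&glob->lru_lock);
	while (!list_empty(&bdev->ddestroy)) {
		/*
		 * Give up and wait for the next worker cycle instead of
		 * voluntarily scheduling away inside the work item.
		 */
		if (!remove_all &&
		    (time_after(jiffies, deadline) || need_resched()))
			break;

		/*
		 * ... take the first BO off ddestroy and release it,
		 * dropping and retaking lru_lock around the cleanup,
		 * exactly as the existing loop body does ...
		 */
	}
	empty = list_empty(&bdev->ddestroy);
	spin_unlock(&glob->lru_lock);

	/* Callers requeue the delayed work while the list is non-empty. */
	return empty;
}

The trade-off mentioned above still applies: because the work item never voluntarily sleeps, the workqueue never has to start a replacement worker thread, but the cleanup also cannot opportunistically keep running on an otherwise idle CPU the way the cond_resched() approach allows.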