Comment # 5
on bug 100712
from Julien Isorce
(In reply to Michel Dänzer from comment #4)
> (In reply to Julien Isorce from comment #0)
> > In kernel radeon_object.c::radeon_bo_list_validate, once "bytes_moved >
> > bytes_moved_threshold" is reached (this is the case for 850 bo in the same
> > list_for_each_entry loop), I can see that radeon_ib_schedule emits a fence
> > that it takes more than the radeon.lockup_timeout to be signaled.
>
> radeon_ib_schedule is called for submitting the command stream from
> userspace, not for any BO moves directly, right?
>
> How did you determine that this hang is directly related to bytes_moved /
> bytes_moved_threshold? Maybe it's only indirectly related, e.g. due to the
> threshold preventing a BO from being moved to VRAM despite userspace's
> preference.
>

I added a trace, and the fence that is not signaled in time is always the one emitted by radeon_ib_schedule after the bytes_moved_threshold is reached. But you are right, it could be only indirectly related.

Here is the sequence I have:

ioctl_radeon_cs
  radeon_bo_list_validate
  bytes_moved > bytes_moved_threshold(=1024*1024ull)
  800 bo are not moved from gtt to vram because of that.
  radeon_cs_ib_vm_chunk
  radeon_ib_schedule(rdev, &parser->ib, NULL, true);
  radeon_fence_emit on ring 0
  r600_mmio_hdp_flush
/ioctl_radeon_cs

Then anything calling ttm_bo_wait will block for longer than radeon.lockup_timeout because the above fence is not signaled in time.

Could it be that something is not flushed properly? (ref: https://patchwork.kernel.org/patch/5807141/ ? tlb_flush ?)

Are you saying that some bos are required to be moved from gtt to vram in order for this fence to be signaled?

As you can see above, it happens when vram_usage >= half_vram, so radeon_bo_get_threshold_for_moves returns 1024*1024, which explains why only 1 or 2 bos can be moved from gtt to vram in that case and why all the others are forced to stay in gtt.
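The threshold behaviour described above can be sketched in userspace C. This is a simplified model under stated assumptions, not the kernel code: threshold_for_moves mirrors the arithmetic of radeon_bo_get_threshold_for_moves as I understand it, and count_moved_bos is a hypothetical helper that assumes every bo has the same size and that bytes_moved grows by exactly that size on each move.

```c
#include <stddef.h>
#include <stdint.h>

/* Lower bound on the per-submission move budget (1 MiB), as in the trace. */
#define MIN_MOVE_THRESHOLD (1024 * 1024ull)

/*
 * Simplified model of the move threshold: a quarter of the VRAM that is
 * still free below the half-full mark, but never less than 1 MiB.
 * Assumption: the parameters stand in for the real device state.
 */
static uint64_t threshold_for_moves(uint64_t real_vram_size, uint64_t vram_usage)
{
    uint64_t half_vram = real_vram_size >> 1;
    uint64_t half_free_vram =
        vram_usage >= half_vram ? 0 : half_vram - vram_usage;
    uint64_t threshold = half_free_vram >> 1;

    return threshold > MIN_MOVE_THRESHOLD ? threshold : MIN_MOVE_THRESHOLD;
}

/*
 * Hypothetical helper: count how many of n_bos equally sized bos sitting in
 * GTT would be allowed to move to VRAM before "bytes_moved > threshold"
 * forces the remaining ones to stay where they are.
 */
static size_t count_moved_bos(size_t n_bos, uint64_t bo_size, uint64_t threshold)
{
    uint64_t bytes_moved = 0;
    size_t moved = 0;

    for (size_t i = 0; i < n_bos; i++) {
        if (bytes_moved > threshold)
            continue; /* budget exceeded: this bo stays in GTT */
        bytes_moved += bo_size;
        moved++;
    }
    return moved;
}
```

With VRAM at least half full, the model returns the 1 MiB floor, so with bos of 1 MiB or more only one or two of them get moved in a single pass, which is consistent with the trace above.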
In the same run of radeon_bo_list_validate there are many calls to ttm_bo_validate with both domain and current_domain as VRAM; this is the case for around 400 bo. Maybe this delays the signaling of this fence, given that vram usage is high too.

>
> > Also it seems the fence is signaled by swapper after more than 10 seconds
> > but it is too late. I requires to reduce the "15" param above to 4 to see
> > that.
>
> How does "swapper" (what is that exactly?) signal the fence?

My wording was wrong, sorry. I should have said "the first entity noticing that the fence is signaled" by calling radeon_fence_activity. swapper is the name of process 0 (idle). I changed the drm logging to print the process name and id: (current->comm, current->pid)

>
> It might be worth looking into why this happens, though. If domain ==
> current_domain == RADEON_GEM_DOMAIN_VRAM, I wouldn't expect ttm_bo_validate
> to trigger a blit.

I will check, though I think I just got confused by a previous trace.