On 17.03.2015 07:32, Alex Deucher wrote: > On Thu, Mar 12, 2015 at 10:55 PM, Michel Dänzer <michel@xxxxxxxxxxx> wrote: >> On 12.03.2015 22:09, Alex Deucher wrote: >>> On Thu, Mar 12, 2015 at 5:23 AM, Christian König >>> <deathsimple@xxxxxxxxxxx> wrote: >>>> On 12.03.2015 10:02, Michel Dänzer wrote: >>>>> >>>>> On 12.03.2015 06:14, Alex Deucher wrote: >>>>>> >>>>>> On Wed, Mar 11, 2015 at 4:51 PM, Alex Deucher <alexdeucher@xxxxxxxxx> >>>>>> wrote: >>>>>>> >>>>>>> On Wed, Mar 11, 2015 at 2:21 PM, Christian König >>>>>>> <deathsimple@xxxxxxxxxxx> wrote: >>>>>>>> >>>>>>>> On 11.03.2015 16:44, Alex Deucher wrote: >>>>>>>>> >>>>>>>>> radeon_bo_create() calls radeon_ttm_placement_from_domain() >>>>>>>>> before ttm_bo_init() is called. radeon_ttm_placement_from_domain() >>>>>>>>> uses the ttm bo size to determine when to select top down >>>>>>>>> allocation but since the ttm bo is not initialized yet the >>>>>>>>> check is always false. >>>>>>>>> >>>>>>>>> Noticed-by: Oded Gabbay <oded.gabbay@xxxxxxx> >>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> >>>>>>>>> Cc: stable@xxxxxxxxxxxxxxx >>>>>>>> >>>>>>>> >>>>>>>> And I was already wondering why the heck the BOs always made this >>>>>>>> ping/pong >>>>>>>> in memory after creation. >>>>>>>> >>>>>>>> Patch is Reviewed-by: Christian König <christian.koenig@xxxxxxx> >>>>>>> >>>>>>> And fixing that promptly broke VCE due to vram location requirements. >>>>>>> Updated patch attached. Thoughts? >>>>>> >>>>>> And one more take to make things a bit more explicit for static kernel >>>>>> driver allocations. >>>>> >>>>> struct ttm_place::lpfn is honoured even with TTM_PL_FLAG_TOPDOWN, so >>>>> latter should work with RADEON_GEM_CPU_ACCESS. It sounds like the >>>>> problem is really that some BOs are expected to be within a certain >>>>> range from the beginning of VRAM, but lpfn isn't set accordingly. It >>>>> would be better to fix that by setting lpfn directly than indirectly via >>>>> RADEON_GEM_CPU_ACCESS. >>>> >>>> >>>> Yeah, agree. We should probably try to find the root cause of this instead. >>>> >>>> As far as I know VCE has no documented limitation on where buffers are >>>> placed (unlike UVD). So this is a bit strange. Are you sure that it isn't >>>> UVD which breaks here? >>> >>> It's definitely VCE, I don't know why UVD didn't have a problem. I >>> considered using pin_restricted to make sure it got pinned in the CPU >>> visible region, but that had two problems: 1. it would end up getting >>> migrated when pinned, >> >> Maybe something like radeon_uvd_force_into_uvd_segment() is needed for >> VCE as well? >> >> >>> 2. it would end up at the top of the restricted >>> region since the top down flag is set which would end up fragmenting >>> vram. >> >> If that's an issue (which outweighs the supposed benefit of >> TTM_PL_FLAG_TOPDOWN), then again the proper solution would be not to set >> TTM_PL_FLAG_TOPDOWN when rbo->placements[i].lpfn != 0 and smaller than >> the whole available region, instead of checking for VRAM and >> RADEON_GEM_CPU_ACCESS. >> > > How about something like the attached patch? I'm not really sure > about the restrictions for the UVD and VCE fw and stack/heap buffers, > but this seems to work. It seems like the current UVD/VCE code works > by accident since the check for TOPDOWN fails. This patch is getting a bit messy, mixing several logically separate changes. Can you split it up accordingly? E.g. one patch just adding the new fpfn and lpfn function parameters but passing 0 for them (so no functional change), then one or several patches with the corresponding functional changes, and finally one patch adding the new size parameter (and thus making TTM_PL_FLAG_TOPDOWN actually used for newly allocated BOs). I think that would help for reviewing and generally understanding the changes. > @@ -105,14 +106,17 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain) > */ > if ((rbo->flags & RADEON_GEM_NO_CPU_ACCESS) && > rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size) { > - rbo->placements[c].fpfn = > - rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT; > + if (fpfn > (rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT)) > + rbo->placements[c].fpfn = fpfn; > + else > + rbo->placements[c].fpfn = > + rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT; > rbo->placements[c++].flags = TTM_PL_FLAG_WC | > TTM_PL_FLAG_UNCACHED | > TTM_PL_FLAG_VRAM; > } If (fpfn >= rbo->rdev->mc.visible_vram_size), this whole block can be skipped, since the next placement will be identical. OTOH, fpfn is currently always 0 anyway, so maybe it's better not to add that parameter in the first place. Other than that, looks good to me. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel