Hi Andres,

With this patch applied, OCLperf hangs. Could you explain more?

If I understand correctly, the difference with and without this patch is whether the first two queues or all queues of pipe0 are set in queue_bitmap. The UMD can then submit compute commands to a different number of queues, selected by amdgpu_queue_mgr_map.

I checked the amdgpu_queue_mgr_map implementation: CS_IOCTL can map a user ring to different hw rings depending on whether they are busy or idle, right? If so, I see a bug there that will break our sched_fence. Our sched fence assumes that jobs are executed in order, and your queue mapping breaks that assumption.

Regards,
David Zhou

On 2017-09-27 00:22, Andres Rodriguez wrote:
> A performance regression for OpenCL tests on Polaris11 had this feature
> disabled for all asics.
>
> Instead, disable it selectively on the affected asics.
>
> Signed-off-by: Andres Rodriguez <andresx7 at gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 4f6c68f..3d76e76 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -109,9 +109,20 @@ void amdgpu_gfx_parse_disable_cu(unsigned *mask, unsigned max_se, unsigned max_s
>  	}
>  }
>  
> +static bool amdgpu_gfx_is_multipipe_capable(struct amdgpu_device *adev)
> +{
> +	/* FIXME: spreading the queues across pipes causes perf regressions
> +	 * on POLARIS11 compute workloads */
> +	if (adev->asic_type == CHIP_POLARIS11)
> +		return false;
> +
> +	return adev->gfx.mec.num_mec > 1;
> +}
> +
>  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>  {
>  	int i, queue, pipe, mec;
> +	bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
>  
>  	/* policy for amdgpu compute queue ownership */
>  	for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) {
> @@ -125,8 +136,7 @@ void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>  		if (mec >= adev->gfx.mec.num_mec)
>  			break;
>  
> -		/* FIXME: spreading the queues across pipes causes perf regressions */
> -		if (0) {
> +		if (multipipe_policy) {
>  			/* policy: amdgpu owns the first two queues of the first MEC */
>  			if (mec == 0 && queue < 2)
>  				set_bit(i, adev->gfx.mec.queue_bitmap);