On 07/06/2018 02:26 AM, S, Shirish wrote:
On 7/5/2018 11:11 PM, Harry Wentland wrote:
On 2018-07-05 12:09 PM, Andrey Grodzovsky wrote:
On 07/05/2018 12:06 PM, Greg KH wrote:
On Thu, Jul 05, 2018 at 12:05:09PM -0400, Andrey Grodzovsky wrote:
This patch is wrong as noted by MIchel a while ago - quote from
his review
of the patch.
"Actually, pflip_status should really only be set to
AMDGPU_FLIP_SUBMITTED
after the flip has been programmed to the hardware, at least as
far as the
lock holder is concerned. That's why the code was previously
holding the
lock until after the dc_commit_updates_for_stream call. Otherwise,
it's at
least theoretically possible that either:
* dm_pflip_high_irq is called before dc_commit_updates_for_stream,
but sees
flip_status == AMDGPU_FLIP_SUBMITTED and sends the event to userspace
prematurely
* dm_pflip_high_irq is called after dc_commit_updates_for_stream,
but sees
flip_status != AMDGPU_FLIP_SUBMITTED, so it incorrectly doesn't
send the
event to userspace "
It shouldn't go in.
Is there a fix for this in Linus's tree for these problems?
If not, why not? If so, what is that git commit id?
AFAIK there is still no fix for the original issue which this patch
was intended to fix.
Andrey,
Ideally sleep's should not be coded within spinlock block, in this
case its coded in dc_commit_updates_for_stream() at the below
occurrences:
1. In drivers/gpu/drm/amd/amdgpu/atom.c =>
amdgpu_atom_execute_table_locked() => kzalloc
2. mutex callbacks in drivers/gpu/drm/amd/powerplay/amd_powerplay.c &
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c.
3. In drivers/gpu/drm/amd/display/dc/core/dc.c => dc_create_state() =>
kzalloc
In my case, when rendering on multiple planes, it led to system
instabilities and with this patch applied we no more see them.
(analysis further below in this mail).
Also when posted on amd-gfx with the corresponding RB's there was no
objection to this patch not sure if nobody saw this patch getting merged.
Harry, can you please comment on this ?
That can be better answered by Shirish.
In my opinion the issue described by Michel is a lesser evil than
attempting to allocate memory inside a spinlocked area.
Yes Harry. I agree, hence i pushed this patch as we have a code that
attempts to allocate memory and furthermore calls mutex's.
Shirish, we need to understand why dc_commit_updates_for_stream() is
trying to do a UPDATE_TYPE_FULL in your scenario. This should never
really happen through the amdgpu_dm_do_flip code path and we need to
understand why.
This happens in amdgpu_dm_do_flip() because of the hard-coded number
of surfaces it call dc_commit_updates_for_stream() with,
dc_commit_updates_for_stream(adev->dm.dc,
surface_updates,
1, <<<<<<<<<
acrtc_state->stream,
NULL,
&surface_updates->surface,
state);
which is 1, where as in case of rendering on multiple planes the
stream has 2 surfaces attached and hence the below condition fails in
very first check of check_update_surfaces_for_stream()
"if ( ... || stream_status->plane_count != surface_count)"
If we never do UPDATE_TYPE_FULL here we should not need to allocate
memory and should be find with leaving the spinlock around
dc_commit_updates_for_stream().
As i understand we may end-up trying to do a UPDATE_TYPE_FULL due to
failure in any of the several checks in
dc_check_update_surfaces_for_stream().
We cannot rule out the possibility of a code path being executed as
long as it exists, ideally we should code another callback in
amdgpu_dm_flip() that presumes the update type is always
UPDATE_TYPE_FAST.
Till we fix this hard-coding i recommend that we should have this
patch, am open to avoiding it if it breaks existing working systems or
offends any builds.
Thanks.
Regards,
Shirish S
I believe that at least then we should put a BIG comment in the code
explaining this change both to avoid future people looking into it
trying to revert it
believing it's just a plain bug and for possible proper revert when the
underlying issue with sleeping in atomic context is resolved.
Andrey
Harry
Andrey
thanks,
greg k-h