Re: [PATCH 3/4] drm/radeon: consolidate uvd/vce initialization, resume and suspend.

On 16.03.2016 at 16:56, Jerome Glisse wrote:
On Wed, Mar 16, 2016 at 04:19:16PM +0100, Christian König wrote:
On 16.03.2016 at 15:59, Jerome Glisse wrote:
On Wed, Mar 16, 2016 at 2:03 PM, Christian König
<deathsimple@xxxxxxxxxxx> wrote:
On 16.03.2016 at 13:48, Jérôme Glisse wrote:
From: Jérome Glisse <jglisse@xxxxxxxxxx>

This consolidates uvd/vce handling into a common shape for all generations.
It also leverages the rdev->has_uvd flag to know when it is useless to
try to resume/suspend the uvd/vce blocks.

There are no functional changes when there is no error. On error the
device driver will behave differently than before after this patch.
It should now safely ignore uvd/vce errors and keep the other engines
operating normally. This is an improvement over the current situation where
we have different behavior depending on GPU generation and on what
fails.

Finally, this is a preparatory step for a patch which allows disabling
uvd/vce as a driver option.

This has only been tested on Southern Islands, so please test it on
other generations (I do not have the hardware handy for now).

Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: Christian König <christian.koenig@xxxxxxx>
NAK, skipping UVD and VCE suspend/resume when initialization fails should
already be implemented.

There might be some (quite some) bugs in there, but that doesn't justify
reworking the initialization over all different generations. Especially
since you don't have hardware to test all of them.

Just make sure that radeon_uvd/vce_fini() is called when something goes
wrong and/or that the UVD/VCE BO is properly released.

Regards,
Christian.
The current code is a mess when it comes to handling errors related to
uvd/vce. This patch consolidates the control flow into something easy to
follow. You can check that there is absolutely no control flow change
for the case where uvd/vce works, and thus it does not break anything
that works. It will only fail gracefully and clean up if things go
wrong. So while I have not tested on other hw, I am confident that this
does not introduce a regression.

I tried to do it without consolidation, but it ended up adding even
more if() levels, to the point that lines began past 80 columns. So please
reconsider, because this is an improvement over the existing code.
Well then, please point out, using the SI or CIK code as an example, what
exactly is missing here.
Going from:
	if (rdev->has_uvd) {
		r = uvd_v2_2_resume(rdev);
		if (!r) {
			r = radeon_fence_driver_start_ring(rdev,
							   R600_RING_TYPE_UVD_INDEX);
			if (r)
				dev_err(rdev->dev, "UVD fences init error (%d).\n", r);
		}
		if (r)
			rdev->ring[R600_RING_TYPE_UVD_INDEX].ring_size = 0;
	}
	r = radeon_vce_resume(rdev);
	if (!r) {
		r = vce_v1_0_resume(rdev);
		if (!r)
			r = radeon_fence_driver_start_ring(rdev,
							   TN_RING_TYPE_VCE1_INDEX);
		if (!r)
			r = radeon_fence_driver_start_ring(rdev,
							   TN_RING_TYPE_VCE2_INDEX);
	}
	if (r) {
		dev_err(rdev->dev, "VCE init error (%d).\n", r);
		rdev->ring[TN_RING_TYPE_VCE1_INDEX].ring_size = 0;
		rdev->ring[TN_RING_TYPE_VCE2_INDEX].ring_size = 0;
	}


To:
	r = uvd_v2_2_resume(rdev);
	if (r)
		goto error;
	r = radeon_fence_driver_start_ring(rdev, R600_RING_TYPE_UVD_INDEX);
	if (r)
		goto error_uvd;
	r = radeon_vce_resume(rdev);
	if (r)
		goto error_uvd;
	r = vce_v1_0_resume(rdev);
	if (r)
		goto error_vce;
	r = radeon_fence_driver_start_ring(rdev, TN_RING_TYPE_VCE1_INDEX);
	if (r)
		goto error_vce;
	r = radeon_fence_driver_start_ring(rdev, TN_RING_TYPE_VCE2_INDEX);
	if (r)
		goto error_vce;
	return;
error_vce:
	radeon_vce_suspend(rdev);
error_uvd:
	radeon_uvd_suspend(rdev);
error:
	dev_err(rdev->dev, "UVD/VCE startup error (%d).\n", r);
	/* On error just disable everything. */
	radeon_vce_fini(rdev);
	radeon_uvd_fini(rdev);
	rdev->ring[R600_RING_TYPE_UVD_INDEX].ring_size = 0;
	rdev->ring[TN_RING_TYPE_VCE1_INDEX].ring_size = 0;
	rdev->ring[TN_RING_TYPE_VCE2_INDEX].ring_size = 0;

And as I said that is exactly what you should NOT be doing here. Once the firmware is loaded the block should be kept in that state.

Freeing the memory allocated for the firmware is also not a good idea at all because we don't know who exactly is accessing it.



This is a lot clearer to me than a bunch of intertwined if/else: a clear error
path for which you do not have to jump through if levels to see what gets
executed or not on error. The only difference is that it ties uvd and vce together.
I did that on purpose because on the hw I am playing with, vce seems to be
useless when the uvd block fails (the opposite seems to be true too). If you think
we should still try to init vce when uvd fails, or uvd when vce fails, I can
split uvd and vce.

UVD and VCE are two completely separate blocks; they shouldn't be related to each other in any way.

When you see both fail at the same time, it's rather unlikely that the cause actually lies in the blocks themselves.
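
To illustrate what keeping the two blocks independent could look like, here is a minimal standalone sketch. It is not driver code: struct media_block, media_block_start() and media_start_all() are hypothetical names invented for this example, and the start callbacks merely simulate success or failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for one media block (UVD or VCE); not a radeon type. */
struct media_block {
	const char *name;
	bool ready;		/* may userspace submit work to this block? */
	int (*start)(void);	/* returns 0 on success, negative on error */
};

/* Start one block and record the outcome; never touches any other block. */
static int media_block_start(struct media_block *blk)
{
	int r;

	assert(blk && blk->start);
	r = blk->start();
	blk->ready = (r == 0);
	if (r)
		fprintf(stderr, "%s start error (%d).\n", blk->name, r);
	return r;
}

/*
 * Start both blocks independently. A failure of one does not prevent the
 * other from being tried. Returns the number of blocks that came up.
 */
static int media_start_all(struct media_block *uvd, struct media_block *vce)
{
	int up = 0;

	if (media_block_start(uvd) == 0)
		up++;
	if (media_block_start(vce) == 0)
		up++;
	return up;
}

/* Simulated start callbacks; -5 is an arbitrary negative error code. */
static int start_ok(void)   { return 0; }
static int start_fail(void) { return -5; }
```

With a failing UVD callback and a working VCE callback, media_start_all() leaves only the UVD block marked not ready; VCE is still brought up.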


The other difference from the existing code is that on error I free the
resources normally used by uvd/vce (the fw buffer). This is just me trying to
free resources early, and it has no impact since the blocks are not working.

-----------------------------------------------------------------------------------

Second part we go from:
	if (rdev->has_uvd) {
		ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
		if (ring->ring_size) {
			r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
					     RADEON_CP_PACKET2);
			if (!r)
				r = uvd_v1_0_init(rdev);
			if (r)
				DRM_ERROR("radeon: failed initializing UVD (%d).\n", r);
		}
	}
	r = -ENOENT;
	ring = &rdev->ring[TN_RING_TYPE_VCE1_INDEX];
	if (ring->ring_size)
		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
				     VCE_CMD_NO_OP);
	ring = &rdev->ring[TN_RING_TYPE_VCE2_INDEX];
	if (ring->ring_size)
		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
				     VCE_CMD_NO_OP);
	if (!r)
		r = vce_v1_0_init(rdev);
	else if (r != -ENOENT)
		DRM_ERROR("radeon: failed initializing VCE (%d).\n", r);


To:
	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
	r = radeon_ring_init(rdev, ring, ring->ring_size, 0, RADEON_CP_PACKET2);
	if (r)
		goto error;
	r = uvd_v1_0_init(rdev);
	if (r)
		goto error_uvd;
	ring = &rdev->ring[TN_RING_TYPE_VCE1_INDEX];
	r = radeon_ring_init(rdev, ring, ring->ring_size, 0, VCE_CMD_NO_OP);
	if (r)
		goto error_vce1;
	ring = &rdev->ring[TN_RING_TYPE_VCE2_INDEX];
	r = radeon_ring_init(rdev, ring, ring->ring_size, 0, VCE_CMD_NO_OP);
	if (r)
		goto error_vce2;
	r = vce_v1_0_init(rdev);
	if (r)
		goto error_vce;
	return;
error_vce:
	radeon_ring_fini(rdev, &rdev->ring[TN_RING_TYPE_VCE2_INDEX]);
error_vce2:
	radeon_ring_fini(rdev, &rdev->ring[TN_RING_TYPE_VCE1_INDEX]);
error_vce1:
	uvd_v1_0_fini(rdev);
error_uvd:
	radeon_ring_fini(rdev, &rdev->ring[R600_RING_TYPE_UVD_INDEX]);
error:
	dev_err(rdev->dev, "UVD/VCE resume error (%d).\n", r);
	/* On error just disable everything. */
	radeon_uvd_suspend(rdev);
	radeon_vce_suspend(rdev);
	radeon_uvd_fini(rdev);
	radeon_vce_fini(rdev);
	rdev->ring[R600_RING_TYPE_UVD_INDEX].ring_size = 0;
	rdev->ring[TN_RING_TYPE_VCE1_INDEX].ring_size = 0;
	rdev->ring[TN_RING_TYPE_VCE2_INDEX].ring_size = 0;

Again, the control flow is a lot simpler to follow than jumping through various
levels of if/else. Again uvd and vce are tied together (and again I can untie
them if you think that is better).

But this time the extra thing is that I properly disable the rings if any error
happens, while the existing code does not.

And again that is exactly what we should NOT do.

When initialization fails we don't know which state the ring buffer and micro engines are in, so freeing them and giving the space back to be reused is clearly not a good idea.

All we should do is clear the ready flag when something fails, to prevent userspace from making command submissions to the failed engine.
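
As a rough illustration of that approach, here is a self-contained sketch under stated assumptions. struct sketch_ring and both functions are hypothetical and only model the idea; they are not the radeon ring API. On failure only the ready flag changes, the ring size (standing in for the buffer) is left alone, and the submission path refuses work for an engine that is not ready.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring model; a real driver ring has many more fields. */
struct sketch_ring {
	bool ready;		/* cleared when bring-up fails */
	size_t ring_size;	/* left untouched on failure: nothing is freed */
	unsigned int submitted;	/* number of accepted submissions */
};

/* Record the result of bring-up. On failure only the ready flag changes. */
static void sketch_ring_init_done(struct sketch_ring *ring, int err)
{
	assert(ring);
	ring->ready = (err == 0);
}

/* Submission entry point: refuse work for an engine that is not ready. */
static int sketch_ring_submit(struct sketch_ring *ring)
{
	assert(ring);
	if (!ring->ready)
		return -1;	/* a real driver would return a proper errno */
	ring->submitted++;
	return 0;
}
```

The point of the design is that a failed engine stays allocated but unreachable, so nothing that the hardware might still be touching is handed back for reuse.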



I am not pasting the init path, but it is the same logic: tying uvd and vce
together and simplifying the error code path.



Please also note that VCE/UVD has dependencies on power management, so
once they are initialized they should NOT be turned off again.

I only briefly skimmed over your patch, but it actually looks to me like
you broke that by trying to clean up the initialization routine.
I have seen that, but assuming Heisenberg does not get involved, and given
that the block is not responding to register writes, it is unlikely that things
will be worse if we try to disable the block. And from my testing it does
not impact power management. My guess is that the block keeps reporting it is
busy and that power gating and clock gating are inhibited by that.

It's more complicated than just a simple busy signal. The engines actively communicate with the power management controller to tell it their needs and limits for the clocks based on the workload they have.

Once initialized the power management controller expects the UVD and VCE micro-controllers to answer such requests.

Failing to do so can get you stuck at a specific power level.


The other thing I am doing over the existing code is freeing the memory for the
fw buffer. I do not think it is a big deal. I am doing that because then I just
flag uvd as dead (rdev->has_uvd = 0) and avoid trying to restore it
on the next suspend/resume or hibernation cycle.
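
The flag-gating part of this can be sketched in isolation as follows. This is only a model under stated assumptions: struct sketch_device, sketch_uvd_resume() and sketch_device_resume() are hypothetical names, the resume result is simulated, and the sketch deliberately leaves out any freeing of the fw buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; only the fields needed for the example. */
struct sketch_device {
	bool has_uvd;		/* false once the block is considered dead */
	int uvd_resume_calls;	/* how often a resume was attempted */
	int uvd_resume_result;	/* what the simulated hardware returns */
};

/* Simulated block resume: counts the attempt and returns the canned result. */
static int sketch_uvd_resume(struct sketch_device *dev)
{
	dev->uvd_resume_calls++;
	return dev->uvd_resume_result;
}

/*
 * Device resume path: skip the block when it is flagged dead, and flag it
 * dead when its resume fails. The failure is not propagated, so the rest of
 * the device keeps working.
 */
static int sketch_device_resume(struct sketch_device *dev)
{
	assert(dev);
	if (dev->has_uvd && sketch_uvd_resume(dev) != 0)
		dev->has_uvd = false;
	return 0;
}
```

After one failed resume the flag is cleared, and later resume cycles no longer attempt to touch the block.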

So again, the only thing I am changing is the case where things do not work.
With that patch I can actually hibernate the laptop and get back a working desktop,
modulo video decoding/encoding no longer working. I call that an improvement.

It's nice that it works for you now, but my laptop is working fine with UVD and VCE as well and I would like to keep it that way.

As far as I can see you're actually messing the error handling up quite a bit here instead of improving it.

So please describe in detail the problems you are seeing and why disabling both UVD and VCE helps with them.

A kernel log from a failed suspend/resume cycle would help quite a bit here.

Regards,
Christian.


Jérôme
