Hi Jerome,
Yeah, that's what we have already planned for the IOCTLs anyway.
The main question currently is rather how we do the fence representation
and synchronization between different engines.
For fences I think we can agree to use 64-bit values (maybe plus an engine
index) for the internal representation, and Android-style fence fds for
sharing them between applications/drivers (assuming the Android fences
ever leave staging).
But what should we do for inter-engine synchronization? One requirement
from our closed source teams is that we have semaphore/mutex style
synchronization, i.e. fences are only used for GPU-to-CPU sync, while
GPU-to-GPU sync uses a different sync object type.
Semaphores work like a mutex that protects resources from concurrent
access; e.g. we can make two command submissions that both need access to
buffer A, where we don't care which submission runs first as long as they
don't run at the same time.
The obvious downside is that you inherit problems like lock inversion, so
you can build deadlocks with them, e.g. command submission A needs to
wait for B and B needs to wait for A. These are hard to detect and
resolve in the kernel except by using timeouts.
Ideas or thoughts on that?
Regards,
Christian.
On 08.10.2014 at 18:00, Jerome Glisse wrote:
Hi,
So if I do not start the discussion now it might already be too late. Given
the plan to converge the open source driver and the closed source driver
onto a single common kernel driver, and that this would be a new kernel
driver, this is an opportunity to fix some of the radeon design issues (at
least things that I would have done differently if only I could get some
gas for my DeLorean).
Among the things I would not do again is the chunk stuff associated with the
cs ioctl. I find it ugly; if my memory serves me well, I was trying to be
future proof and allow the cs ioctl to be extended. While this original aim
has been somewhat successful, I think it was the wrong way to do it.
My latest idea (and what I still believe to be a good one until proven
wrong) is to change the way we do ioctls and use a little trick. This idea
was also sparked by the continuous additions we make to the info ioctl,
which is getting ugly.
The idea is simple: each ioctl would use some struct like:
struct radeon_ioctl {
    u32 version;
    u32 size;
};
The version field is the key here; think of it as an index into an array of
ioctl dispatch functions. So something like:
struct radeon_ioctls {
    int (*ioctl[MAX_IOCTL_NUM])(void *data, ...);
};
struct radeon_ioctls rdispatch_ioctls[N];
And now all ioctls go through this single entry point:
int radeon_ioctl_stub(int ioctl, void *data, ...)
{
    struct radeon_ioctl *rio = data;
    return rdispatch_ioctls[rio->version].ioctl[ioctl](data, ...);
}
So this is a rough idea, but the point is that we can do proper ioctl
versioning, have separate functions for each new version, and avoid ioctl
cruft; or at least that is the theory.
This gives us two things. The first is feedback from userspace: from the
version, the kernel will know which version of userspace it is dealing
with. The other is that it allows you to introduce a completely new API,
either for a new generation of hardware or just as an evolution. A small
bonus is that it allows us to slowly phase out APIs we consider broken
(ioctl by ioctl).
So this is the main design change that I would make. I should probably
rough up something that goes deeper for the cs ioctl.
Cheers,
Jérôme
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel