Re: RFC: Radeon multi ring support branch

On 17.11.2011 17:58, Jerome Glisse wrote:
> 2011/11/17 Christian König <deathsimple@xxxxxxxxxxx>:
>> On 16.11.2011 01:24, Jerome Glisse wrote:
>>> Well, as we don't specify which value the semaphore should wait on, I am
>>> pretty sure the first ring to increment the semaphore will unblock all
>>> waiters. So if you have ring1 waiting on ring2 and ring3, then as soon
>>> as ring2 or ring3 is done, ring1 will go on while the other ring might
>>> not be done yet. I will test that tomorrow, but from the docs I have it
>>> seems so. Thus it would be broken with more than one ring, which would
>>> mean you have to allocate one semaphore for each ring couple you want to
>>> synchronize. Note that the usual case will likely be syncing between 2
>>> rings.
>> Good point, but I played with it a bit more today and it behaves just as
>> I thought it would. A single signal command unblocks only a single
>> waiter, even if there are currently multiple waiters on this semaphore;
>> the only thing you can't tell is which waiter will go first.
> I am not talking about multiple waiters, but about a single waiter with
> multiple signalers. The current implementation allocates one and only one
> semaphore, no matter how many rings you want to synchronize with. That
> means the single waiter will be unblocked as soon as only a single ring
> signals, which is broken if you want to wait on several rings.
Wait a moment, having a single semaphore doesn't mean that there is only a single waiter; just take a look at the code again:

        for (i = 0; i < RADEON_NUM_RINGS; ++i) {
                /* no need to sync to our own or unused rings */
                if (i == p->ring || !p->sync_to_ring[i])
                        continue;

                if (!p->ib->fence->semaphore) {
                        r = radeon_semaphore_create(p->rdev, &p->ib->fence->semaphore);
                        if (r)
                                return r;
                }

                radeon_semaphore_emit_signal(p->rdev, i, p->ib->fence->semaphore);
                radeon_semaphore_emit_wait(p->rdev, p->ring, p->ib->fence->semaphore); <--------
        }

It is allocating a single semaphore, that's correct, but at the same time it emits multiple wait commands. So if we want to synchronize with just one ring, we wait once on the semaphore, but if we want to synchronize with N rings, we also wait N times on the same semaphore. Since we don't really care about the order in which the signal operations arrive, that is pretty much OK: when execution continues after the last wait command, it is guaranteed that every signal operation has been fired. The extended semaphore test in my repository also checks for this condition, and it really seems to work fine.
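
To make that concrete, my mental model of the hardware semaphore is a memory-backed counter: a signal command increments it, a wait command blocks until it can decrement it. The following is only a CPU-side sketch of that model with made-up names, not driver code:

        /* Conceptual model only: a hardware semaphore behaves like a
         * counter. SIGNAL increments it; WAIT blocks until the counter is
         * positive, then decrements it. With one semaphore, two signaling
         * rings and two WAITs on the consumer ring, the consumer passes
         * the second WAIT only after BOTH signals have arrived, in
         * whatever order they fire. */
        struct gpu_semaphore {
                int count;
        };

        static void gpu_semaphore_signal(struct gpu_semaphore *sem)
        {
                sem->count++; /* emitted on each signaling ring */
        }

        static void gpu_semaphore_wait(struct gpu_semaphore *sem)
        {
                while (sem->count <= 0)
                        ; /* the real hardware stalls the waiting ring here */
                sem->count--;
        }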

>> I should also note that the current algorithm will just emit multiple
>> wait operations to a single ring, and spread the signal operations over
>> all the other rings we are interested in. That isn't very efficient, but
>> it should indeed work quite fine.
> Well, the CS patch you posted doesn't do that. It creates one semaphore,
> emits a wait on that semaphore, and emits a signal to all rings. That was
> my point. But I agree that creating a semaphore for each ring couple you
> want to wait on will work.
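
For reference, the per-ring-couple variant would look roughly like the sketch below; sync_sem is a hypothetical per-ring array on the fence that doesn't exist in the current patch:

        for (i = 0; i < RADEON_NUM_RINGS; ++i) {
                /* no need to sync to our own or unused rings */
                if (i == p->ring || !p->sync_to_ring[i])
                        continue;

                /* sketch: one semaphore per (signaler, waiter) ring
                 * couple, instead of one shared semaphore with N waits */
                r = radeon_semaphore_create(p->rdev, &p->ib->fence->sync_sem[i]);
                if (r)
                        return r;

                radeon_semaphore_emit_signal(p->rdev, i, p->ib->fence->sync_sem[i]);
                radeon_semaphore_emit_wait(p->rdev, p->ring, p->ib->fence->sync_sem[i]);
        }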

> After retesting, the first patch "drm/radeon: fix debugfs handling" is a
> NAK, a complete no-go.
>
> The issue is that radeon_debugfs_cleanup is called after rdev is freed.
> This is why I used a static array. I forgot about that; I should have put
> a comment there. I guess you built your kernel without debugfs, or you
> didn't test reloading the module.
Mhm, I have tested it, seen the crash, and didn't think that it was a
problem. Don't ask me why; I can't understand it myself right now.

Anyway, I moved the unregistering of the files into a separate function,
which is now called from radeon_device_fini instead of
radeon_debugfs_cleanup. That seems to work fine, at least if I haven't
missed something else.
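
A rough sketch of what I mean, assuming a per-device debugfs array on struct radeon_device (the names are illustrative, not the exact patch):

        /* sketch: drop our debugfs files while rdev is still valid,
         * instead of relying on radeon_debugfs_cleanup(), which only runs
         * after rdev has already been freed */
        static void radeon_debugfs_remove_files(struct radeon_device *rdev)
        {
        #if defined(CONFIG_DEBUG_FS)
                unsigned i;

                for (i = 0; i < rdev->debugfs_count; i++) {
                        drm_debugfs_remove_files(rdev->debugfs[i].files,
                                                 rdev->debugfs[i].num_files,
                                                 rdev->ddev->primary);
                }
        #endif
        }

radeon_device_fini then calls radeon_debugfs_remove_files(rdev) before the rest of the device is torn down.
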
> Please don't touch the debugfs stuff, it's fine the way it is. No need to
> change anything there.
At least for me there is a need to change something, because using a static variable causes the debugfs files to only appear for the first card in the system. And in my configuration the first card is always the onboard APU, so I wondered for half a day why a locked-up IB doesn't contain any data, until I finally realized that I was looking at the wrong GPU in the system.
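
The fix is simply to keep that bookkeeping per device instead of in a file-scope static, along these lines (again only a sketch, member names are illustrative):

        /* sketch: per-device debugfs bookkeeping, so a second GPU gets
         * its own entries instead of being shadowed by the first one */
        struct radeon_debugfs {
                struct drm_info_list    *files;
                unsigned                num_files;
        };

        struct radeon_device {
                ...
                struct radeon_debugfs   debugfs[RADEON_DEBUGFS_MAX_COMPONENTS];
                unsigned                debugfs_count;
        };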

>> I also merged your indentation fixes and the fix for the never-allocated
>> semaphores, and pushed the result into my public repository
>> (http://cgit.freedesktop.org/~deathsimple/linux), so please take another
>> look at it.

> I will take a look.

Thanks, but please be aware that I'm also going to integrate Alex's patches on top of it today and give it another test round, just to make sure that I'm not doing anything stupid with the debugfs code. So if you haven't done so already, please wait a bit more.

Thanks,
Christian.
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel


