Hello,

When profiling my workload on an AMD E-350 (PALM GPU) to see why it still wasn't performing well with Jerome's WIP macrotiling patches, I noticed that r600_fence_finish was taking 10% of my CPU time. I determined experimentally that changing from sched_yield() to os_time_sleep(10) fixed this and resolved my last performance issue on AMD Fusion as compared to Intel Atom, but felt that this was hacky.

I've therefore tried to use INT_SEL of 0b10 in the EVENT_WRITE_EOP in Mesa, combined with a new ioctl to wait for a changed value, but it's not working the way I would expect.

I'll be sending patches as replies to this message, so that you can see exactly what I've done, but in brief, I have an ioctl that uses wait_event to wait for a chosen offset in a BO to change value. I've added a suitable waitqueue, and made radeon_fence_process call wake_up_all.

I'm seeing behaviour from this that I can't explain; as you'll see in the patches, I've moved some IRQ prints from DRM_DEBUG to printk(KERN_INFO), and I'm seeing that I don't get the EOP interrupt in a timely fashion - either because memory is not as coherent between the GPU and CPU as I would like (so I'm reading stale data when I call wait_event), or because the interrupt is genuinely delayed.

From dmesg (with commentary):

X11 and my GL compositor start.

[ 84.423567] IH: CP EOP
[ 84.423600] Woke kernel fences
[ 84.423606] Waking up all waiters

This looks like an EOP for a kernel-side fence.

[ 84.651320] wait_user_fence offset 4 value 0 timeout -1
[ 84.651332] Current data value 0

This is my compositor coming in via my new ioctl, to wait for EOP. I get bored of waiting for the ioctl to complete, and send the compositor SIGSTOP then SIGCONT.

[ 149.970629] wait_user_fence offset 4 value 0 timeout -1
[ 149.970635] Finished data value 1

My new ioctl completes, as the data has changed value. I was expecting an EOP interrupt before this, which hasn't arrived - why?

[ 150.224675] wait_user_fence offset 8 value 0 timeout -1
[ 150.224692] Current data value 0

The compositor comes in again, waiting on a different fence.

[ 150.250166] IH: CP EOP
[ 150.250194] Woke kernel fences
[ 150.250198] Waking up all waiters

This looks like an EOP for a kernel-side fence.

[ 150.250212] IH: CP EOP
[ 150.250216] Waking up all waiters

This looks like an EOP for the fence that completed at time 149.970 - why's it been delayed?

[ 150.250219] IH: CP EOP
[ 150.250221] Waking up all waiters

And another EOP that I can't tie up to command buffers.

[ 150.250327] IH: CP EOP
[ 150.250335] Woke kernel fences
[ 150.250337] Waking up all waiters

Kernel fence.

[ 150.250559] IH: CP EOP
[ 150.250567] Woke kernel fences
[ 150.250570] Waking up all waiters

Another kernel fence.

[ 150.250581] IH: CP EOP
[ 150.250583] Waking up all waiters
[ 150.250595] wait_user_fence offset 8 value 0 timeout -1
[ 150.250604] IH: CP EOP
[ 150.250608] Waking up all waiters
[ 150.250615] Finished data value 1

Two user fence EOPs in a row, one of which woke up my process.

[ 150.251462] IH: CP EOP
[ 150.251477] Woke kernel fences
[ 150.251480] Waking up all waiters

Kernel fence.

[ 150.256806] wait_user_fence offset 0 value 0 timeout -1
[ 150.256828] Current data value 0

Stalled again, waiting for EOP interrupt. Could be because the GPU and CPU have different views of memory at this point.

There will be two patches in reply to this mail - one is the Mesa patch, one the kernel patch; I would greatly appreciate help getting this going.

-- 
Simon Farnsworth
Software Engineer
ONELAN Limited
http://www.onelan.com/
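
P.S. For discussion only, here is a minimal userspace sketch of the wait semantics described above. It is not the kernel patch: a pthread condition variable stands in for the waitqueue, a plain array stands in for the BO, and user_fence_signal() stands in for the GPU write plus the wake_up_all in radeon_fence_process. All names here (struct user_fence, user_fence_wait, user_fence_signal) are hypothetical, and treating the timeout as milliseconds with a negative value meaning "wait forever" is an assumption made to match the "timeout -1" in the log.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#define USER_FENCE_DWORDS 4

/* Hypothetical userspace stand-in for the BO plus the kernel waitqueue. */
struct user_fence {
    pthread_mutex_t lock;
    pthread_cond_t  changed;                  /* plays the role of the waitqueue */
    uint32_t        data[USER_FENCE_DWORDS];  /* plays the role of the BO contents */
};

static void user_fence_init(struct user_fence *f)
{
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->changed, NULL);
    for (unsigned i = 0; i < USER_FENCE_DWORDS; i++)
        f->data[i] = 0;
}

/* Models the GPU writing the fence value followed by wake_up_all(). */
static void user_fence_signal(struct user_fence *f, unsigned byte_offset,
                              uint32_t value)
{
    unsigned idx = byte_offset / 4;
    assert(byte_offset % 4 == 0 && idx < USER_FENCE_DWORDS);

    pthread_mutex_lock(&f->lock);
    f->data[idx] = value;
    pthread_cond_broadcast(&f->changed);      /* wake every waiter */
    pthread_mutex_unlock(&f->lock);
}

/*
 * Block until the dword at byte_offset differs from old_value.
 * timeout_ms < 0 waits forever (assumed meaning of "timeout -1").
 * Returns 0 if the value changed, ETIMEDOUT if it did not change in time.
 */
static int user_fence_wait(struct user_fence *f, unsigned byte_offset,
                           uint32_t old_value, long timeout_ms, uint32_t *out)
{
    unsigned idx = byte_offset / 4;
    struct timespec deadline = {0, 0};
    uint32_t now;
    int rc = 0;

    assert(byte_offset % 4 == 0 && idx < USER_FENCE_DWORDS);

    if (timeout_ms >= 0) {
        /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline. */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec  += timeout_ms / 1000;
        deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec++;
            deadline.tv_nsec -= 1000000000L;
        }
    }

    pthread_mutex_lock(&f->lock);
    /* Re-check the condition after every wakeup, as wait_event() does. */
    while (f->data[idx] == old_value && rc == 0) {
        if (timeout_ms < 0)
            rc = pthread_cond_wait(&f->changed, &f->lock);
        else
            rc = pthread_cond_timedwait(&f->changed, &f->lock, &deadline);
    }
    now = f->data[idx];
    pthread_mutex_unlock(&f->lock);

    if (out)
        *out = now;
    return now != old_value ? 0 : rc;
}
```

A waiter would call user_fence_wait(&f, 4, 0, -1, &seen) to block until the dword at byte offset 4 stops being 0. Unlike this sketch, where the mutex guarantees the waiter sees the write, the real ioctl depends on the CPU observing the GPU's write to the BO, which is exactly the coherency question raised above.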
_______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel