Re: [PATCH 1/2] igt/gem_exec_nop: add burst submission to parallel execution test

On 18/08/16 16:36, Dave Gordon wrote:
On 18/08/16 16:27, Dave Gordon wrote:

[snip]

Note that SKL GuC firmware 6.1 didn't support dual submission or lite
restore, whereas the next version (8.11) does. Therefore, with the newer
firmware we don't see the same slowdown when going to 1-at-a-time
round-robin. I have a different (new) test that shows this more clearly.
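
As a rough illustration of what "1-at-a-time round-robin" means for
submission order, here is a small sketch. It assumes that the N in the
"parallel/N" labels is the number of batches queued to one engine before
moving on to the next; the function name submission_order is
hypothetical and this is not the test's actual code.

```python
from itertools import islice

# Ring names as they appear in the test output.
ENGINES = ["render", "bsd", "blt", "vebox"]


def submission_order(burst, engines=ENGINES):
    """Yield engine names in the order batches would be queued.

    Assumption: 'burst' batches go to one engine, then the next engine
    gets its burst, and so on round-robin, forever.
    """
    while True:
        for engine in engines:
            for _ in range(burst):
                yield engine


# With burst=1 every consecutive submission targets a different engine;
# with larger bursts, consecutive submissions mostly share an engine.
print(list(islice(submission_order(1), 8)))
print(list(islice(submission_order(2), 8)))
```

With a burst of 1, back-to-back submissions never land on the same
engine, which is the case where the lack of dual submission and lite
restore would be expected to hurt most.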

This is with GuC version 6.1:

skylake# ./intel-gpu-tools/tests/gem_exec_paranop | fgrep -v SUCCESS

Time to exec 8-byte batch:      3.428µs (ring=render)
Time to exec 8-byte batch:      2.444µs (ring=bsd)
Time to exec 8-byte batch:      2.394µs (ring=blt)
Time to exec 8-byte batch:      2.615µs (ring=vebox)
Time to exec 8-byte batch:      2.625µs (ring=all, sequential)
Time to exec 8-byte batch:     12.701µs (ring=all, parallel/1) ***
Time to exec 8-byte batch:      7.259µs (ring=all, parallel/2)
Time to exec 8-byte batch:      4.336µs (ring=all, parallel/4)
Time to exec 8-byte batch:      2.937µs (ring=all, parallel/8)
Time to exec 8-byte batch:      2.661µs (ring=all, parallel/16)
Time to exec 8-byte batch:      2.245µs (ring=all, parallel/32)
Time to exec 8-byte batch:      1.626µs (ring=all, parallel/64)
Time to exec 8-byte batch:      2.170µs (ring=all, parallel/128)
Time to exec 8-byte batch:      1.804µs (ring=all, parallel/256)
Time to exec 8-byte batch:      2.602µs (ring=all, parallel/512)
Time to exec 8-byte batch:      2.602µs (ring=all, parallel/1024)
Time to exec 8-byte batch:      2.607µs (ring=all, parallel/2048)

And for comparison, here are the figures with v8.11:

# ./intel-gpu-tools/tests/gem_exec_paranop | fgrep -v SUCCESS

Time to exec 8-byte batch:      3.458µs (ring=render)
Time to exec 8-byte batch:      2.154µs (ring=bsd)
Time to exec 8-byte batch:      2.156µs (ring=blt)
Time to exec 8-byte batch:      2.156µs (ring=vebox)
Time to exec 8-byte batch:      2.388µs (ring=all, sequential)
Time to exec 8-byte batch:      5.897µs (ring=all, parallel/1)
Time to exec 8-byte batch:      4.669µs (ring=all, parallel/2)
Time to exec 8-byte batch:      4.278µs (ring=all, parallel/4)
Time to exec 8-byte batch:      2.410µs (ring=all, parallel/8)
Time to exec 8-byte batch:      2.165µs (ring=all, parallel/16)
Time to exec 8-byte batch:      2.158µs (ring=all, parallel/32)
Time to exec 8-byte batch:      1.594µs (ring=all, parallel/64)
Time to exec 8-byte batch:      1.583µs (ring=all, parallel/128)
Time to exec 8-byte batch:      2.473µs (ring=all, parallel/256)
Time to exec 8-byte batch:      2.264µs (ring=all, parallel/512)
Time to exec 8-byte batch:      2.357µs (ring=all, parallel/1024)
Time to exec 8-byte batch:      2.382µs (ring=all, parallel/2048)

Generally everything is slightly faster, but parallel/1 is approximately
twice as fast (12.701µs down to 5.897µs), while parallel/64 is virtually
unchanged (1.626µs vs 1.594µs), as are the timings for the large bursts.
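
To compare the two runs line by line rather than by eye, a short script
along these lines could be used. It is only a sketch: the figures are
copied from the output quoted above, and the helper names (parse,
speedups) are made up for the example.

```python
import re

# Output of the two runs, copied from this message.
GUC_6_1 = """\
Time to exec 8-byte batch:      3.428µs (ring=render)
Time to exec 8-byte batch:      2.444µs (ring=bsd)
Time to exec 8-byte batch:      2.394µs (ring=blt)
Time to exec 8-byte batch:      2.615µs (ring=vebox)
Time to exec 8-byte batch:      2.625µs (ring=all, sequential)
Time to exec 8-byte batch:     12.701µs (ring=all, parallel/1)
Time to exec 8-byte batch:      7.259µs (ring=all, parallel/2)
Time to exec 8-byte batch:      4.336µs (ring=all, parallel/4)
Time to exec 8-byte batch:      2.937µs (ring=all, parallel/8)
Time to exec 8-byte batch:      2.661µs (ring=all, parallel/16)
Time to exec 8-byte batch:      2.245µs (ring=all, parallel/32)
Time to exec 8-byte batch:      1.626µs (ring=all, parallel/64)
Time to exec 8-byte batch:      2.170µs (ring=all, parallel/128)
Time to exec 8-byte batch:      1.804µs (ring=all, parallel/256)
Time to exec 8-byte batch:      2.602µs (ring=all, parallel/512)
Time to exec 8-byte batch:      2.602µs (ring=all, parallel/1024)
Time to exec 8-byte batch:      2.607µs (ring=all, parallel/2048)
"""

GUC_8_11 = """\
Time to exec 8-byte batch:      3.458µs (ring=render)
Time to exec 8-byte batch:      2.154µs (ring=bsd)
Time to exec 8-byte batch:      2.156µs (ring=blt)
Time to exec 8-byte batch:      2.156µs (ring=vebox)
Time to exec 8-byte batch:      2.388µs (ring=all, sequential)
Time to exec 8-byte batch:      5.897µs (ring=all, parallel/1)
Time to exec 8-byte batch:      4.669µs (ring=all, parallel/2)
Time to exec 8-byte batch:      4.278µs (ring=all, parallel/4)
Time to exec 8-byte batch:      2.410µs (ring=all, parallel/8)
Time to exec 8-byte batch:      2.165µs (ring=all, parallel/16)
Time to exec 8-byte batch:      2.158µs (ring=all, parallel/32)
Time to exec 8-byte batch:      1.594µs (ring=all, parallel/64)
Time to exec 8-byte batch:      1.583µs (ring=all, parallel/128)
Time to exec 8-byte batch:      2.473µs (ring=all, parallel/256)
Time to exec 8-byte batch:      2.264µs (ring=all, parallel/512)
Time to exec 8-byte batch:      2.357µs (ring=all, parallel/1024)
Time to exec 8-byte batch:      2.382µs (ring=all, parallel/2048)
"""

# Captures the time in microseconds and the text after "ring=".
LINE_RE = re.compile(r"batch:\s+([\d.]+)\S+ \(ring=(.+)\)")


def parse(text):
    """Map each ring label to its time in microseconds."""
    matches = (LINE_RE.search(line) for line in text.splitlines())
    return {m.group(2): float(m.group(1)) for m in matches if m}


def speedups(old, new):
    """Ratio old/new per label; values above 1.0 mean 'new' is faster."""
    return {label: old[label] / new[label] for label in old if label in new}


ratios = speedups(parse(GUC_6_1), parse(GUC_8_11))
for label, ratio in ratios.items():
    print(f"{label:22s} {ratio:5.2f}x")
```

A ratio above 1.0 means the 8.11 run was faster for that line; the
parallel/1 entry is the one that stands out.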

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



