On Tue, Apr 07, 2015 at 06:59:59PM +0200, Christian Gmeiner wrote:
> Hi Lucas.
>
> 2015-04-07 17:29 GMT+02:00 Lucas Stach <l.stach@xxxxxxxxxxxxxx>:
> > And I don't get why each core needs to have a single device node. IMHO
> > this is purely an implementation decision weather to have one device
> > node for all cores or one device node per core.
>
> It is an important decision. And I think that one device node per core
> reflects the hardware design to 100%.

Since when do the interfaces to userspace need to reflect the hardware
design?

Isn't the point of having a userspace interface, in part, to abstract
the hardware design details and provide userspace with something that
is relatively easy to use without needlessly exposing the variation
of the underlying hardware?

Please get away from the idea that userspace interfaces should reflect
the hardware design.

> What makes harder to get it right? The needed changes to the kernel
> driver are not that hard. The user space is an other story but thats
> because of the render-only thing, where we need to pass (prime)
> buffers around and do fence syncs etc. In the end I do not see a
> showstopper in the user space.

The fence syncs are an issue when you have multiple cores - that's
something I started to sort out in my patch series, but when you
appeared to refuse to accept some of the patches, I stopped...

The problem when you have multiple cores is that one global fence event
counter, which gets compared to the fence values in each buffer object,
no longer works.

Consider this scenario:

You have two threads, thread A making use of a 2D core, and thread B
using the 3D core.

Thread B submits a big long render operation, and the buffers get
assigned fence number 1.

Thread A submits a short render operation, and the buffers get assigned
fence number 2.

The 2D core finishes, and sends its interrupt. Etnaviv updates the
completed fence position to 2.
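
As a rough illustration of the scenario above, here is a toy Python
model of a single global completed-fence counter shared by two cores.
The class and method names are made up for this sketch and are not
taken from the etnaviv driver; it only shows how comparing a buffer's
fence number against one global counter goes wrong when cores complete
out of order.

```python
# Toy model of one global "completed fence" counter shared by all cores.
# All names are hypothetical; this is not the etnaviv implementation.

class GlobalFenceTracker:
    def __init__(self):
        self.next_fence = 0       # last fence number handed out
        self.completed_fence = 0  # highest fence any core has reported

    def submit(self, core):
        """Assign the next fence number to a submission on `core`."""
        self.next_fence += 1
        return self.next_fence

    def core_interrupt(self, fence):
        """A core reports that its submission with this fence finished."""
        self.completed_fence = max(self.completed_fence, fence)

    def is_complete(self, fence):
        # A buffer looks idle once the global counter has reached its fence.
        return fence <= self.completed_fence


tracker = GlobalFenceTracker()
fence_3d = tracker.submit("3D")   # thread B: long render, fence 1
fence_2d = tracker.submit("2D")   # thread A: short render, fence 2

tracker.core_interrupt(fence_2d)  # the 2D core finishes first

# The 3D core is still running, yet its buffers now look idle.
print(tracker.is_complete(fence_2d))  # True (correct)
print(tracker.is_complete(fence_3d))  # True (wrong: 3D work is unfinished)
```
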
At this point, we believe that fence numbers 1 and 2 are now complete,
despite the 3D core continuing to execute and operate on the buffers
with fence number 1.

I'm certain that the fence implementation we currently have can't be
made to work with multiple cores with a few tweaks - we need something
better to cater for what is essentially out-of-order completion amongst
the cores.

A simple resolution to that _would_ be your argument of exposing each
GPU as a separate DRM node, because then we get completely separate
accounting of each - but it needlessly adds an expense in userspace.

Userspace would have to make multiple calls - to each GPU DRM node - to
check whether the buffer is busy on any of the GPUs as it may not know
which GPU could be using the buffer, especially if it got it via a
dmabuf fd sent over the DRI3 protocol.

To me, that sounds like a burden on userspace.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.