> It is not really true to say that the code is duplicated.
>
> Beignet maps OpenCL into GPGPU pipeline of IvyBridge+ hardware.
> So this is real GPGPU other than mimic GPGPU with 3D functions.

Nobody mimics GPGPU with 3D functions now. This hasn't been necessary since DX11 hardware came out, and maybe some DX10.1 hardware.

> GPGPU different with 3D pipeline a lot on IvyBridge+.
> Both the pipeline setting and run time are totally different than that in 3D driver.
> The GPU thread spawn model, thread communication model, memory model are also totally different.

This is also mostly true for other GPUs: there is only a single shader stage, and the setup for it is quite different. I'm assuming IvyBridge is based around DX11 Compute, so I doubt it's that much different.

The gallium compute interface is not the same as the gallium 3D interface; you could quite easily have written a compute-only gallium driver in the time it would take to even start writing this. I realise Ben left this code drop, but you guys could have at least spent some time checking out what else was out there.

> Also the binary representation is different.
> Ben choose LLVM scalar IR for many reasons(you can find the
> decision make reason in the document), that means IR backend are different.

Again, not really important; we use LLVM and TGSI backends in the radeon drivers now. The gallium drivers currently use GPU-specific LLVM IRs produced from specific LLVM backends, but that isn't strictly necessary, it just leads to better optimisation opportunities.

> For GPGPU programming, I don't see a lot benefit to introduce state tracker.
> There is not so many states to track.

Because you now have a whole bunch of code that is useless to anyone else. Distros want to ship this sort of thing; we don't want 5 Mesa-like implementations for OpenCL, we want one we can actually distribute and manage, and maybe still support in 5 years.

> The project is already a functional OpenCL implementation on IvyBridge at this point.
>
> 1. Most of the language features are supported.
> 2. Most of built-in functions are supported.
> 3. Global, Local memory, thread barriers are supported.
> 3. OpenGL to OpenCL texture sharing are supported.
>
> We have already implement something like CSS filters with this driver,
> and we see performance gain than OpenGL filters.

Some other questions:

a) Have you got an IvyBridge LLVM backend? Are you going to upstream it? I heard it isn't even written in LLVM machine description format.

b) Have you looked at pocl, libclc etc.? Maybe you guys want to run on CPU as well as GPU at some point, in which case, again, looking around before jumping into implementing stuff might help.

c) Does this use the open source ICD at least?

Dave.