> I've done a very cursory read of this, and my original comment from my > original high-level review on the internal list still stands: I'm freaked out by how > invasive this is into the existing ring code. All the changes in i915_dma.c look > very suspicious, since that code is for the legacy ums crap and will _never_ run > on bdw. Nor even on anything more modern than g4x platforms (gen4). > > Apparently that review has been lost/ignored, so I'll quote it in full: Back in March 2013 I wasn´t involved on Execlists (or, FWIW, i915) at all so, yes, I´m afraid it got lost. Sorry :( > "In reading through your patches the big thing which jumped out is how the > new execlist code is intermingled with the old legacy gen8 framebuffer stuff. > Imo those two modes don't match at all, and we need to resolve this mismatch > with another abstraction layer ;-) IMHO, I have minimized quite a lot the number of conflicts (very few "if execlists enabled" scattered over the legacy code remeining). Rather than "being intermingled", I would say it now "smoothly rides" on the legacy stuff. If you give it a look with fresh eyes, maybe you will like it? > "I'm thinking of dev_priv->gt.do_execbuf which takes care of all the lower-level > request tracking and command submission. Feel free to massively bikeshed the > name. I'm thinking that we should move everything from > i915_gem_execbuffer_move_to_gpu to > i915_gem_execbuffer_retire_commands into that callback. With the current > code that'd include the active list tracking, maybe we should move that part out > again. Otoh if we go wild with scheduling and preemption, active bo tracking > _will_ be rather different from previous platforms. To support execlist we > might need some more vfuncs in the ringbuffer struct to support execlist > specific stuff (submit execlist, enable context switch interrupts), but a lot of the > existing stuff will be redudant. > At the end (once things settles) we should then document which kind of > do_execbuf uses which kinds of low-level ring interfaces. I´m afraid I would still need all the currently existing low-level ring interfaces, plus a bunch of new ones :( (do_execbuf is not the only user of low-level ring interfaces, see below). > "With that abstraction: > - We can separate gen8 execlist from legacy gen8 code, and so should avoid > regressions (and so blocking mesa). > - Play around with different execlist approaches (guc, deferred execution, > whatever, ...) since it'll just be differen copy&pasta. Ok, yes, I understand what you say. Playing around with other submission approaches will become harder and harder if we reuse too much legacy code. But can´t we do it as a cleanup on top of this code? > "Finally I think our immediate focus for execlist enabling should be to get multi- > context execlists going, so that we can validate whether that'll work together > with mesa/hw contexts. If it doesn't, not much point in bothering. > The simplest way is to just block in the ->do_execbuf callback if we can't submit > the new context right away. It'll suck a bit perf-wise, but will get the hw going. I think this is proved by now (QED!). > So essentially what I'd prefer is we keep all the existing ringbuffer code as-is, > and throw in a complete new set (with fairly new datastructures) in for > execlists. Then only interaction points would be: > - Ring init either calls into legacy ring init or new fancy execlist ring > init. > - Execbuf calls ring->do_submit with ring/engine, ctx object, batch state > and otherwise doesn't care one bit how it will all get submitted. That´s easy for execbufer, but what about the code that put things directly into the ringbuffer? I refer to constructions like: intel_ring_begin() intel_ring_emit() ... intel_ring_emit() intel_ring_advance() And also others like direct calls to i915_add_request outside the execbuffer path. > - Context state needs to be frobbed a bit so that we create the correct > backing object (i.e. legacy hw state or execlist ring+ctx). To make this > feasible it's probably best to switch the implicit per-fd ctx to be > per-ring. That way we still have the fixed hw-contxt->ring/engine > relationship and don't need to play tricks with lazy context allocation > (because those beats are so big with execlists). Sorry, I don´t get this: no more implicit per-fd ctx? so everybody uses the Aliasing PPGTT by default again? _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx