On Thu, Feb 09, 2017 at 09:02:06AM +0000, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > This removes the usage of intel_ring_emit in favour of > directly writing to the ring buffer. > > intel_ring_emit was preventing the compiler for optimising > fetch and increment of the current ring buffer pointer and > therefore generating very verbose code for every write. > > It had no useful purpose since all ringbuffer operations > are started and ended with intel_ring_begin and > intel_ring_advance respectively, with no bail out in the > middle possible, so it is fine to increment the tail in > intel_ring_begin and let the code manage the pointer > itself. > > Useless instruction removal amounts to approximately > two and half kilobytes of saved text on my build. > > Not sure if this has any measurable performance > implications but executing a ton of useless instructions > on fast paths cannot be good. > > Patch is not fully polished, but it compiles and runs > on Gen9 at least. > > v2: > * Change return from intel_ring_begin to error pointer by > popular demand. > * Move tail increment to intel_ring_advance to enable some > error checking. > > v3: > * Move tail advance back into intel_ring_begin. > * Rebase and tidy. > > v4: > * Complete rebase after a few months since v3. > > v5: > * Remove unecessary cast and fix !debug compile. (Chris Wilson) > > v6: > * Make intel_ring_offset take request as well. > * Fix recording of request postfix plus a sprinkle of asserts. > (Chris Wilson) > > v7: > * Use intel_ring_offset to get the postfix. (Chris Wilson) > * Convert GVT code as well. > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > Cc: Zhi Wang <zhi.a.wang@xxxxxxxxx> > --- > > Zhi, could you please check that they way I handled the GVT > code makes sense. > > I changed copy_gma_to_hva to return error or number of bytes > copied since that makes it easier for the new flavour of > intel_ring_begin and intel_ring_advance to work. > > I was only able to compile test it. > > Thanks, > > Tvrtko > --- > @@ -2608,36 +2605,33 @@ static int shadow_workload_ring_buffer(struct intel_vgpu_workload *workload) > gma_top = workload->rb_start + guest_rb_size; > > /* allocate shadow ring buffer */ > - ret = intel_ring_begin(workload->req, workload->rb_len / 4); > - if (ret) > - return ret; > + out = intel_ring_begin(workload->req, workload->rb_len / sizeof(u32)); > + if (IS_ERR(out)) > + return PTR_ERR(out); > > /* get shadow ring buffer va */ > - workload->shadow_ring_buffer_va = ring->vaddr + ring->tail; > + workload->shadow_ring_buffer_va = out; > > /* head > tail --> copy head <-> top */ > if (gma_head > gma_tail) { > ret = copy_gma_to_hva(vgpu, vgpu->gtt.ggtt_mm, > - gma_head, gma_top, > - workload->shadow_ring_buffer_va); > - if (ret) { > + gma_head, gma_top, out); > + if (ret < 0) { > gvt_err("fail to copy guest ring buffer\n"); > return ret; > } > - copy_len = gma_top - gma_head; > + out += ret / sizeof(u32); > gma_head = workload->rb_start; > } > > /* copy head or start <-> tail */ > - ret = copy_gma_to_hva(vgpu, vgpu->gtt.ggtt_mm, > - gma_head, gma_tail, > - workload->shadow_ring_buffer_va + copy_len); > - if (ret) { > + ret = copy_gma_to_hva(vgpu, vgpu->gtt.ggtt_mm, gma_head, gma_tail, out); I would have done out = copy_gma_to_have(... out); Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx