On 15/06/15 22:32, Chris Wilson wrote: > On Mon, Jun 15, 2015 at 07:36:27PM +0100, Dave Gordon wrote: >> +static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev, >> + u32 size) >> +{ >> + struct drm_i915_gem_object *obj; >> + >> + obj = i915_gem_alloc_object(dev, size); >> + if (!obj) >> + return NULL; > > Does it need to be a shmemfs object? It needs to be permanently in RAM, so probably not. But I don't see an allocator that gives you a GEM memory object /without/ backing store. Do we have one? >> + if (i915_gem_object_get_pages(obj)) { >> + drm_gem_object_unreference(&obj->base); >> + return NULL; >> + } > > This is a random function call. Which is? Unreferencing the newly-allocated object before returning? Otherwise it will leak :( Presumably if the object didn't have backing store, the get_pages() would be unnecessary as they would already be resident? Or would they not exist until the first get_pages call instantiated them? >> + if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, >> + PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE)) { >> + drm_gem_object_unreference(&obj->base); >> + return NULL; > > How about reporting the right error code? Doesn't add anything. Allocation failure is going to be fatal anyway. And i915_gem_alloc_object() just returns NULL for failure, so we'd have to *make up* an error code for that case :( Oh, and ERR_PTR/PTR_ERR are ugly. >> + } >> + >> + return obj; >> +} >> + >> +/** >> + * gem_release_guc_obj() - Release gem object allocated for GuC usage >> + * @obj: gem obj to be released >> + */ >> +static void gem_release_guc_obj(struct drm_i915_gem_object *obj) >> +{ >> + if (!obj) >> + return; >> + >> + if (i915_gem_obj_is_pinned(obj)) >> + i915_gem_object_ggtt_unpin(obj); > > What? The object /should/ be pinned when we arrive here, thanks to the i915_gem_obj_ggtt_pin() call discussed above. We could just always unpin, and see whether we trigger this: WARN_ON(vma->pin_count == 0); inside i915_gem_object_ggtt_unpin(). The test is just in case there's an error path where the object being released wasn't in fact pinned. >> + drm_gem_object_unreference(&obj->base); >> +} >> + >> +/* >> + * Set up the memory resources to be shared with the GuC. At this point, >> + * we require just one object that can be mapped through the GGTT. >> + */ >> +int i915_guc_submission_init(struct drm_device *dev) >> +{ >> + struct drm_i915_private *dev_priv = dev->dev_private; > > Bleh. Cross-file interface, nonstatic, hence passes 'dev'; also it needs 'dev' anyway, so there's no gain in passing dev_priv. And dev_priv (i.e. struct drm_i915_private) isn't even in scope when this function is declared in the header file. >> + const size_t ctxsize = sizeof(struct guc_context_desc); >> + const size_t poolsize = MAX_GUC_GPU_CONTEXTS * ctxsize; >> + const size_t gemsize = round_up(poolsize, PAGE_SIZE); >> + struct intel_guc *guc = &dev_priv->guc; >> + >> + if (!i915.enable_guc_submission) >> + return 0; /* not enabled */ >> + >> + if (guc->ctx_pool_obj) >> + return 0; /* already allocated */ > > Eh? Where have you hooked into... So looking at that, it looks like you > want to move this into the device initialisation rather than guc > firmware load. To me at least they are conceptually separate stages, and > judging by the above combining them has resulted in very clumsy code. So ... we don't want to allocate any GuC-related structures unless we're going at least try to use the GuC, so this has to come /after/ firmware fetch and validation. OTOH we can't actually fire up the GuC until after these structures are allocated, because the GGTT address of the ctx_pool_obj has to be passed to the GuC firmware as one of its startup parameters. Thus, the only place to do this allocation is in between deciding to transfer the f/w to the GuC and actually doing so. Hence, the GuC loading code calls this each time we're about to squirt the f/w into the GuC; but, we don't need to allocate this more than once (OTOH it would be a violation of modularity for the loader to know that; only the submission code needs to know that little detail). So we end up with the above do-this-only-once code. >> + guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize); >> + if (!guc->ctx_pool_obj) >> + return -ENOMEM; >> + >> + spin_lock_init(&dev_priv->guc.host2guc_lock); >> + >> + ida_init(&guc->ctx_ids); >> + >> + memset(guc->doorbell_bitmap, 0, sizeof(guc->doorbell_bitmap)); >> + guc->db_cacheline = 0; > > Before you relied on guc being zeroed, and now you memset it again. Hmm ... perhaps we should rezero some of these things /every/ time instead? /me examines code ... Nope; it looks like everything is once again zero at the point when this function takes the early exit. >> + return 0; >> +} >> + >> +void i915_guc_submission_fini(struct drm_device *dev) >> +{ >> + struct drm_i915_private *dev_priv = dev->dev_private; >> + struct intel_guc *guc = &dev_priv->guc; >> + >> + gem_release_guc_obj(dev_priv->guc.log_obj); >> + guc->log_obj = NULL; >> + >> + if (guc->ctx_pool_obj) >> + ida_destroy(&guc->ctx_ids); > > Interesting guard. Maybe just make the GuC controller a pointer from > i915 and then you can do a more natural if (i915->guc == NULL) return; That test was because there's no way to tell from ctx_ids itself whether it was initialised (in any case, it's a black box as far as I'm concerned). But the init code above guarantees that iff the pool was allocated, then the rest of the initialisation was also done, so we should call ida_destroy(). I wouldn't object to changing the intel_guc to a separate allocation rather than embedding it. We'd need to add a backpointer though as we currently use container_of() inside the guc_to_i915 macro. But you'd still end up with something like the above, because the allocation of the ctx_pool is still a separate step that can fail after the intel_guc structure has been allocated; and you need the f/w-loading-related data very early. The only saving is a small amount of memory, for an cost of extra memory dereference at various points. So probably not worth it. >> + gem_release_guc_obj(guc->ctx_pool_obj); >> + guc->ctx_pool_obj = NULL; >> +} >> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h >> index 0b44265..06b68c2 100644 >> --- a/drivers/gpu/drm/i915/intel_guc.h >> +++ b/drivers/gpu/drm/i915/intel_guc.h >> @@ -171,4 +171,8 @@ extern void intel_guc_ucode_init(struct drm_device *dev); >> extern int intel_guc_ucode_load(struct drm_device *dev, bool wait); >> extern void intel_guc_ucode_fini(struct drm_device *dev); >> >> +/* i915_guc_submission.c */ >> +int i915_guc_submission_init(struct drm_device *dev); >> +void i915_guc_submission_fini(struct drm_device *dev); >> + >> #endif >> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c >> index 16eef4c..0f74876 100644 >> --- a/drivers/gpu/drm/i915/intel_guc_loader.c >> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c >> @@ -111,6 +111,21 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv) >> i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT; >> } >> >> + /* If GuC scheduling is enabled, setup params here. */ >> + if (i915.enable_guc_submission) { >> + u32 pgs = i915_gem_obj_ggtt_offset(dev_priv->guc.ctx_pool_obj); >> + u32 ctx_in_16 = MAX_GUC_GPU_CONTEXTS / 16; > > So really you didn't need to pin the ctx_pool_obj until this point? Possibly. But that's not long after the allocation above (it's called from the next function that the caller of i915_guc_submission_init() calls after a successful return from that function). intel_guc_ucode_load() i915_guc_submission_init() gem_allocate_guc_obj() -- returns pinned object guc_ucode_xfer() set_guc_init_params() -- needs GGTT address of pinned object And we really don't want any extra failure paths at this depth. Better to pin the object early and bail out if there's a problem. Its going to be pinned for its entire lifetime anyway, so I don't think there's a problem with pinning it a few microseconds early, especially /during first open/ when there's no contention for GGTT space. >> + pgs >>= PAGE_SHIFT; >> + params[GUC_CTL_CTXINFO] = (pgs << GUC_CTL_BASE_ADDR_SHIFT) | >> + (ctx_in_16 << GUC_CTL_CTXNUM_IN16_SHIFT); >> + >> + params[GUC_CTL_FEATURE] |= GUC_CTL_KERNEL_SUBMISSIONS; >> + >> + /* Unmask this bit to enable GuC scheduler */ >> + params[GUC_CTL_FEATURE] &= ~GUC_CTL_DISABLE_SCHEDULER; This line deserves the comment firstly because we explicitly set this bit earlier in the function, but have now decided to clear it again (it was tidier than having unbalanced if-else legs); and secondly to help people not get confused by the number of negations (&= ~x means clear something, but what we're clearing has negative semantics "disable"). So it does convey our intent ("why? to enable the GuC scheduler") as well as how ("*un*mask this bit"). [aside] At least the GuC parameter semantics are not as ugly as some workarounds in the BSpec, where I regularly find things such as "this workaround disables feature A when using option B but need not be applied if condition C holds unless condition D is false or feature E is disabled. The workaround must not be applied in mode F." *Bleeuurgh* [/aside] > /* Enable multiple context submission through the GuC */ > params[GUC_CTL_FEATURE] &= ~GUC_CTL_DISABLE_SCHEDULER; > params[GUC_CTL_FEATURE] |= GUC_CTL_KERNEL_SUBMISSIONS; > > Try to keep comments to explain why rather than what. Most of the > comments here fall into the "i++; // postincrement i" category. > -Chris Most of the "what" comments in this file are associated with accesses to specific h/w registers, which therefore have semantic effect beyond what is explicit in the code. For example this comment: /* tell all command streamers to forward interrupts and vblank to GuC */ irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK,GFX_FORWARD_VBLANK_ALWAYS); irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING); for_each_ring(ring, dev_priv, i) I915_WRITE(RING_MODE_GEN7(ring), irqs); helps the reader what the /effect/ of the writes is intended to be. It's quite different from: /* write bitmask to GEN7 ring mode register */ I915_WRITE(RING_MODE_GEN7(ring),MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING)); and means you may not have to dig through the BSpec to find out what the less helpfully-named bits actually do. And this: I915_WRITE(DE_GUCRMR, ~0); would be incomprehensible without reading the BSpec ... or the comment /* tell DE to send nothing to GuC */ .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx