Re: [bug report] drm/i915/guc: Implement multi-lrc submission

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/22/2022 07:26, Dan Carpenter wrote:
Hello Matthew Brost,

The patch 6b540bf6f143: "drm/i915/guc: Implement multi-lrc
submission" from Oct 14, 2021, leads to the following Smatch static
checker warning:

	drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:752 __guc_add_request()
	warn: refcount leak 'ce->ref.refcount.refs.counter': lines='752'

drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
     672 static int __guc_add_request(struct intel_guc *guc, struct i915_request *rq)
     673 {
     674         int err = 0;
     675         struct intel_context *ce = request_to_scheduling_context(rq);
     676         u32 action[3];
     677         int len = 0;
     678         u32 g2h_len_dw = 0;
     679         bool enabled;
     680
     681         lockdep_assert_held(&rq->engine->sched_engine->lock);
     682
     683         /*
     684          * Corner case where requests were sitting in the priority list or a
     685          * request resubmitted after the context was banned.
     686          */
     687         if (unlikely(intel_context_is_banned(ce))) {
     688                 i915_request_put(i915_request_mark_eio(rq));
     689                 intel_engine_signal_breadcrumbs(ce->engine);
     690                 return 0;
     691         }
     692
     693         GEM_BUG_ON(!atomic_read(&ce->guc_id.ref));
     694         GEM_BUG_ON(context_guc_id_invalid(ce));
     695
     696         if (context_policy_required(ce)) {
     697                 err = guc_context_policy_init_v70(ce, false);
     698                 if (err)
     699                         return err;
     700         }
     701
     702         spin_lock(&ce->guc_state.lock);
     703
     704         /*
     705          * The request / context will be run on the hardware when scheduling
     706          * gets enabled in the unblock. For multi-lrc we still submit the
     707          * context to move the LRC tails.
     708          */
     709         if (unlikely(context_blocked(ce) && !intel_context_is_parent(ce)))
     710                 goto out;
     711
     712         enabled = context_enabled(ce) || context_blocked(ce);
     713
     714         if (!enabled) {
     715                 action[len++] = INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET;
     716                 action[len++] = ce->guc_id.id;
     717                 action[len++] = GUC_CONTEXT_ENABLE;
     718                 set_context_pending_enable(ce);
     719                 intel_context_get(ce);
     720                 g2h_len_dw = G2H_LEN_DW_SCHED_CONTEXT_MODE_SET;
     721         } else {
     722                 action[len++] = INTEL_GUC_ACTION_SCHED_CONTEXT;
     723                 action[len++] = ce->guc_id.id;
     724         }
     725
     726         err = intel_guc_send_nb(guc, action, len, g2h_len_dw);
     727         if (!enabled && !err) {
     728                 trace_intel_context_sched_enable(ce);
     729                 atomic_inc(&guc->outstanding_submission_g2h);
     730                 set_context_enabled(ce);
     731
     732                 /*
     733                  * Without multi-lrc KMD does the submission step (moving the
     734                  * lrc tail) so enabling scheduling is sufficient to submit the
     735                  * context. This isn't the case in multi-lrc submission as the
     736                  * GuC needs to move the tails, hence the need for another H2G
     737                  * to submit a multi-lrc context after enabling scheduling.
     738                  */
     739                 if (intel_context_is_parent(ce)) {
     740                         action[0] = INTEL_GUC_ACTION_SCHED_CONTEXT;
     741                         err = intel_guc_send_nb(guc, action, len - 1, 0);

Should this have a something like:

				if (err)
					intel_context_put(ce);
No.

The ce has been marked as enabled above because the actual enable call succeeded.  This is a secondary call to 'poke' the context. If this fails, the context might not get scheduled in a timely manner but the tracking state is still correct. The context is now in use by GuC and therefore must be reference locked. And the 'context_enabled' flag is set on the i915 side to note that the reference count is held.

If you want to unwind at this point, you would need to send a CONTEXT_MODE_SET(DISABLE) H2G message (amongst other things) and only once that call has completed, can you call intel_context_put(ce).

I don't get why the analyser would be claiming a leak here. I'm not sure if it is just that the analyser is confused or if there is some other potential route to a leak. Is it possible to get more details as to how it thinks the leak can occur?

John.


     742                 }
     743         } else if (!enabled) {
     744                 clr_context_pending_enable(ce);
     745                 intel_context_put(ce);
     746         }
     747         if (likely(!err))
     748                 trace_i915_request_guc_submit(rq);
     749
     750 out:
     751         spin_unlock(&ce->guc_state.lock);
--> 752         return err;
     753 }

regards,
dan carpenter




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux