On Fri, 2022-09-16 at 08:36 -0700, Ceraolo Spurio, Daniele wrote:
>
> On 9/16/2022 1:58 AM, Tvrtko Ursulin wrote:
> >
> > On 16/09/2022 08:53, Teres Alexis, Alan Previn wrote:
> > >
> > > On Thu, 2022-09-15 at 09:48 +0100, Tvrtko Ursulin wrote:
> > > >
> > > > On 15/09/2022 03:12, Alan Previn wrote:
> > > > > From: Matthew Brost <matthew.brost@xxxxxxxxx>
> > > > >
> > > > > +static void guc_context_sched_disable(struct intel_context *ce)
> > > > > +{
> > > > > +	struct intel_guc *guc = ce_to_guc(ce);
> > > > > +	u64 delay = guc->submission_state.sched_disable_delay_ms;
> > > > > +	unsigned long flags;
> > > > > +
> > > > > +	spin_lock_irqsave(&ce->guc_state.lock, flags);
> > > > > +
> > > > > +	if (bypass_sched_disable(guc, ce)) {
> > > > > +		spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > > > > +		intel_context_sched_disable_unpin(ce);
> > > > > +	} else if (!intel_context_is_closed(ce) &&
> > > > > +		   !guc_id_pressure(guc, ce) && delay) {
> > > > > 		spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > > > > -		goto unpin;
> > > > > +		mod_delayed_work(system_unbound_wq,
> > > > > +				 &ce->guc_state.sched_disable_delay,
> > > > > +				 msecs_to_jiffies(delay));
> > > > > +	} else {
> > > > > +		do_sched_disable(guc, ce, flags);
> > > > > 	}
> > > >
> > > > lock
> > > > if
> > > > 	unlock
> > > > 	do stuff
> > > > else if
> > > > 	unlock
> > > > 	do stuff
> > > > else
> > > > 	do_sched_disable - which unlocks inside
> > > >
> > > > IMO it creates less readable code for the benefit of not repeating
> > > > with_intel_runtime_pm -> __guc_context_sched_disable two times. Dunno..
> > > > it's ugly but I have no suggestions. Hm does it have to send using the
> > > > busy loop? It couldn't just queue the request and then wait for reply
> > > > if the disable message was emitted?
> > > >
> > > I agree that the above code lacks readability - will see if I can
> > > break it down into smaller functions with cleaner in-function
> > > lock/unlock pairs. I agree that a little code duplication is better
> > > than less readable code. It was inherited code I didn't want to
> > > modify, but I'll go ahead and refactor this.
> > >
> > > On the busy loop - I'm assuming you are referring to the actual ct
> > > sending. I'll consult my team in case I am missing anything more, but
> > > based on the comments, I believe callers must use that function to
> > > guarantee reservation of space in the G2H CTB so there is always space
> > > to capture responses for actions that MUST be acknowledged by GuC
> > > (acknowledged by either replying with a success or a failure). This is
> > > necessary for a coherent guc-id state machine (because the GuC
> > > firmware will drop requests for guc-ids that are not registered or not
> > > in a 'sched-enabled' state).
> >
> > Maybe to explain what I meant a bit better. I thought that the reason
> > for the strange unlock pattern is the with_runtime_pm which needs to
> > happen for the CT send/receive loop. What I meant was would it be
> > possible to do it like this:

Alan: I assumed the strange lock-unlock was for no other reason than to
ensure that once the delayed task begins executing, we don't let go of
the lock until after we complete the call to
prep_context_pending_disable (which modifies the state machine for the
context's guc-id). And, as we expect, the runtime_pm is about trying to
send the H2G signal to the GuC, which touches registers (i.e. the lock
was for the ce-gucid state and the power was for guc interaction via
ct). If my assumptions are correct, then I can refactor the functions a
bit to remove that weird unlock flow while keeping the lock held from
start until prep_context_pending_disable; else I can still go ahead and
do that (with some minimal duplication).
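For illustration, a rough sketch of the refactor I have in mind - the
helper names are the ones already in the patch, the exact plumbing is
still to be worked out, so treat this as a sketch rather than the final
code:

	static void guc_context_sched_disable(struct intel_context *ce)
	{
		struct intel_runtime_pm *runtime_pm = &ce->engine->gt->i915->runtime_pm;
		struct intel_guc *guc = ce_to_guc(ce);
		u64 delay = guc->submission_state.sched_disable_delay_ms;
		intel_wakeref_t wakeref;
		unsigned long flags;
		bool disable = false;
		u16 guc_id;

		spin_lock_irqsave(&ce->guc_state.lock, flags);

		if (bypass_sched_disable(guc, ce)) {
			/* Nothing to tell GuC; just unpin below. */
		} else if (!intel_context_is_closed(ce) &&
			   !guc_id_pressure(guc, ce) && delay) {
			/* Defer the actual disable; unpin happens from the worker. */
			mod_delayed_work(system_unbound_wq,
					 &ce->guc_state.sched_disable_delay,
					 msecs_to_jiffies(delay));
			spin_unlock_irqrestore(&ce->guc_state.lock, flags);
			return;
		} else {
			/*
			 * Keep the lock held across the guc-id state change so
			 * nothing observes a half-updated state machine.
			 */
			guc_id = prep_context_pending_disable(ce);
			disable = true;
		}

		spin_unlock_irqrestore(&ce->guc_state.lock, flags);

		if (disable) {
			/* Only this part talks to the GuC, hence the rpm. */
			with_intel_runtime_pm(runtime_pm, wakeref)
				__guc_context_sched_disable(guc, ce, guc_id);
		} else {
			intel_context_sched_disable_unpin(ce);
		}
	}

Each branch's lock/unlock now pairs up inside the function, at the cost
of one extra flag.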
> > 	state lock
> > 	...
> > 	sent = queue_msg_2_guc (this would be i915 only, no runtime pm needed)
> > 	...
> > 	state unlock
> >
> > 	if (sent)
> > 		with_runtime_pm:
> > 			send/receive queued guc messages (only this would talk to guc)
> >
> > But I have truly no idea if the CT messaging infrastructure supports
> > something like this.

Alan: I'll look around and see if we have something that provides that
flow, but I don't think I should pursue such a redesign as part of this
series if none exists.

> > Anyway, see what is possible, and if it is not, or too hard for now,
> > leave it be.
> >
> > > > > -	guc_id = prep_context_pending_disable(ce);
> > > > > +}
> > > > > -	spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > > > > +static void guc_flush_all_delayed_disable_sched_contexts(struct intel_guc *guc)
> > > > > +{
> > > > > +	struct intel_context *ce;
> > > > > +	unsigned long index;
> > > > > +	unsigned long flags;
> > > > > +	unsigned long ceflags;
> > > > > -	with_intel_runtime_pm(runtime_pm, wakeref)
> > > > > -		__guc_context_sched_disable(guc, ce, guc_id);
> > > > > +	xa_lock_irqsave(&guc->context_lookup, flags);
> > > > > +	xa_for_each(&guc->context_lookup, index, ce) {
> > > > > +		if (!kref_get_unless_zero(&ce->ref))
> > > > > +			continue;
> > > > > +		xa_unlock(&guc->context_lookup);
> > > >
> > > > So this whole loop _needs_ to run with interrupts disabled? Explaining
> > > > why in a comment would be good.
> > > >
> > > Being mid-reset, the locking mode is consistent with other functions
> > > also used as part of the reset preparation that parse and potentially
> > > modify contexts. I believe the goal is to handle all of this parsing
> > > without getting conflicting latent G2H replies that break the
> > > preparation flow (which herds active contexts into a smaller set of
> > > states as part of reset) - but I will double check with my colleagues.
> > >
> > > > > +		if (test_bit(CONTEXT_GUC_INIT, &ce->flags) &&
> > > > > +		    cancel_delayed_work(&ce->guc_state.sched_disable_delay)) {
> > > > > +			spin_lock_irqsave(&ce->guc_state.lock, ceflags);
> > > > > +			spin_unlock_irqrestore(&ce->guc_state.lock, ceflags);
> > > >
> > > > This deserves a comment about what the lock toggling wants to ensure.
> > > >
> > > I realize this might have been my local rebasing mistake; the
> > > intention was to handle cases where sched_disable_delay wasn't pending
> > > but was potentially still executing do_sched_disable. I believe I
> > > could try cancel_delayed_work_sync (but I'm not sure if I can call
> > > that might-sleep function mid-reset while not-interruptible). Else, I
> > > would move that lock-unlock to where cancel_delayed_work did not
> > > return true (as per the original intent before my rebase error).
> >
> > Okay a comment like "flush any currently executing do_sched_disable"
> > above the lock toggling would do then. Even better if you add what
> > happens if that flushing isn't done.

got it. will do.
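Something along these lines perhaps, with the toggle moved back to the
not-cancelled case as per the original intent (only the toggling part of
the loop body is shown, and the comment wording is just a first stab):

	if (test_bit(CONTEXT_GUC_INIT, &ce->flags) &&
	    !cancel_delayed_work(&ce->guc_state.sched_disable_delay)) {
		/*
		 * Flush any currently executing do_sched_disable(): the
		 * delayed work was not pending, so it may be running right
		 * now on another CPU with guc_state.lock held. Once we can
		 * take the lock, any in-flight do_sched_disable() has
		 * finished updating the guc-id state machine; without this,
		 * the reset preparation below could observe (and act on) a
		 * half-completed sched-disable.
		 */
		spin_lock_irqsave(&ce->guc_state.lock, ceflags);
		spin_unlock_irqrestore(&ce->guc_state.lock, ceflags);
	}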
> > > > Also, if the loop runs with interrupts disabled, what is the point
> > > > of the irqsave variant in here??
> > >
> > > Yes - it's redundant, let me fix that, apologies for that.
> > >
> > > > Also2, what is the reason for dropping the lock? intel_context_put?
> > >
> > > Being consistent with other reset preparation code that closes
> > > contexts, the lock is dropped before the intel_context_put.
> > > (I hope I am not misunderstanding your question).
> >
> > Right, okay.. so for locking inversion issues - intel_context_put must
> > not be called with guc_state.lock held, yes?
> >
> > Not a mandatory request, but if you want you could look into the
> > option of avoiding the lock dropping and instead doing the xa_erase
> > and recording the contexts for sched disable or put in a local list,
> > and doing them all in one block outside the locked/irq-disabled
> > section. Although tbh I am not sure if that would improve anything.
> > Probably not in this case of a reset path, but if there are other
> > places in GuC code which do this it may be beneficial for less
> > hammering on the CPU and core synchronisation for atomics.
> >
> > > > > +	/*
> > > > > +	 * If the context gets closed while the execbuf is ongoing, the context
> > > > > +	 * close code will race with the below code to cancel the delayed work.
> > > > > +	 * If the context close wins the race and cancels the work, it will
> > > > > +	 * immediately call the sched disable (see guc_context_close), so there
> > > > > +	 * is a chance we can get past this check while the sched_disable code
> > > > > +	 * is being executed. To make sure that code completes before we check
> > > > > +	 * the status further down, we wait for the close process to complete.
> > > > > +	 */
> > > > > +	if (cancel_delayed_work_sync(&ce->guc_state.sched_disable_delay))
> > > > > +		intel_context_sched_disable_unpin(ce);
> > > > > +	else if (intel_context_is_closed(ce))
> > > > > +		wait_for(context_close_done(ce), 1);
> > > >
> > > > Comment makes it sound important to handle the race, although it
> > > > doesn't really explain the consequences. But most importantly, what
> > > > if close doesn't complete in 1ms?
> > >
> > > will add the consequence (I believe the consequence is that we could
> > > prematurely read the context flags bit indicating its guc-id is still
> > > registered and, after skipping re-registration, find that the context
> > > gets closed and the guc-id freed).
> > >
> > > Yes the 1 second is arbitrary and unnervingly short. Just spent some
> > > time trying to
> >
> > One millisecond even. :)

hehe - typo :)

> Normally 1ms is not a slow time for this. We can only hit the wait if
> the context_close call has already called the cancel_delayed_work, so
> the only thing left to do in that function is to send the H2G, which is
> relatively fast. However, I've just realized that if the H2G buffer is
> full the H2G will stall, so in that case it can take longer (maximum
> stall time before a hang is declared is 1.5s).

Daniele, so increase to 1500 millisecs?

> > What would be the consequence of prematurely reading the still
> > registered context flag? Execbuf fails? GuC hangs? Kernel crashes?
> > Something else?
>
> i915 thinks that a request is pending with the GuC, while GuC thinks we
> disabled it (because the completion of the delayed_disable happens
> after the new request has been submitted). The heartbeat will still go
> through, so no reset will be triggered; the request is just lost. A new
> request submission on the same context should be able to recover it,
> but we didn't test that.
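If we do bump the timeout, the guc_request_alloc snippet could read
roughly like this, folding Daniele's consequence and the 1.5s worst-case
H2G stall into the comment (the number and wording are still to be
confirmed, so again just a sketch):

	/*
	 * If context close wins the race and cancels the work, it calls
	 * the sched disable immediately (see guc_context_close), so wait
	 * for that to complete. If we read the still-registered context
	 * flag prematurely, we skip re-registration while GuC has
	 * scheduling disabled and the new request is silently lost (no
	 * reset fires since the heartbeat still goes through). 1500ms
	 * covers the worst case of the H2G channel stalling on a full
	 * buffer before a hang is declared.
	 */
	if (cancel_delayed_work_sync(&ce->guc_state.sched_disable_delay))
		intel_context_sched_disable_unpin(ce);
	else if (intel_context_is_closed(ce))
		wait_for(context_close_done(ce), 1500);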
> > And why can't this race with context close happen at any other point
> > than this particular spot in guc_request_alloc? Immediately after the
> > added checks? After atomic_add_unless?
>
> The race is tied to the canceling of the delayed work. context_close
> only does something if it cancels the delayed work, so if the
> cancel_delayed_work_sync here does the cancelling then context_close is
> a no-op.
>
> > > figure out portions of the SCHED_foo state machine bits and believe
> > > that it's possible for guc_request_alloc to just force context_close
> > > to be done from here, as it would force it into a state requiring
> > > re-registration and would close that a few lines below. I will
> > > however verify with my team mates as I am new to these SCHED_foo
> > > state machine bits.
>
> I'm not sure we want to force context_close unconditionally here;
> that'd be a big overhead. Doing it only if there is a close pending is
> also potentially an issue - the whole point is that the close can race
> in. The close also has to work on its own, because in the normal
> use-cases we don't get a context_close while a request submission is
> ongoing. Unless you meant something different and I completely
> misunderstood.
>
> Daniele
>
> > Yes it always looked complicated and perhaps it has even grown more
> > so - I'm afraid I cannot be of any help there.
> >
> > Regards,
> >
> > Tvrtko