On 2023-10-27 12:31, Boris Brezillon wrote: > On Fri, 27 Oct 2023 16:23:24 +0200 > Danilo Krummrich <dakr@xxxxxxxxxx> wrote: > >> On 10/27/23 10:25, Boris Brezillon wrote: >>> Hi Danilo, >>> >>> On Thu, 26 Oct 2023 18:13:00 +0200 >>> Danilo Krummrich <dakr@xxxxxxxxxx> wrote: >>> >>>> Currently, job flow control is implemented simply by limiting the number >>>> of jobs in flight. Therefore, a scheduler is initialized with a credit >>>> limit that corresponds to the number of jobs which can be sent to the >>>> hardware. >>>> >>>> This implies that for each job, drivers need to account for the maximum >>>> job size possible in order to not overflow the ring buffer. >>>> >>>> However, there are drivers, such as Nouveau, where the job size has a >>>> rather large range. For such drivers it can easily happen that job >>>> submissions not even filling the ring by 1% can block subsequent >>>> submissions, which, in the worst case, can lead to the ring run dry. >>>> >>>> In order to overcome this issue, allow for tracking the actual job size >>>> instead of the number of jobs. Therefore, add a field to track a job's >>>> credit count, which represents the number of credits a job contributes >>>> to the scheduler's credit limit. >>>> >>>> Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx> >>>> --- >>>> Changes in V2: >>>> ============== >>>> - fixed up influence on scheduling fairness due to consideration of a job's >>>> size >>>> - If we reach a ready entity in drm_sched_select_entity() but can't actually >>>> queue a job from it due to size limitations, just give up and go to sleep >>>> until woken up due to a pending job finishing, rather than continue to try >>>> other entities. >>>> - added a callback to dynamically update a job's credits (Boris) >>> >>> This callback seems controversial. I'd suggest dropping it, so the >>> patch can be merged. >> >> I don't think we should drop it just for the sake of moving forward. If there are objections >> we'll discuss them. I've seen good reasons why the drivers you are working on require this, >> while, following the discussion, I have *not* seen any reasons to drop it. It helps simplifying >> driver code and doesn't introduce any complexity or overhead to existing drivers. > > Up to you. I'm just saying, moving one step in the right direction is > better than being stuck, and it's not like adding this callback in a > follow-up patch is super complicated either. If you're confident that > we can get all parties to agree on keeping this hook, fine by me. I'd rather have it in now, as it is really *the vision* of this patch. There's no point in pushing in something half-baked. -- Regards, Luben
Attachment:
OpenPGP_0x4C15479431A334AF.asc
Description: OpenPGP public key
Attachment:
OpenPGP_signature.asc
Description: OpenPGP digital signature