On Tue, Apr 25, 2017 at 12:13:04PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>
> Tool which emits batch buffers to engines with configurable
> sequences, durations, contexts, dependencies and userspace waits.
>
> Unfinished but shows promise so sending out for early feedback.
>
> v2:
>  * Load workload descriptors from files. (also -w)
>  * Help text.
>  * Calibration control if needed. (-t)
>  * NORELOC | LUT to eb flags.
>  * Added sample workload to wsim/workload1.
>
> v3:
>  * Multiple parallel different workloads (-w -w ...).
>  * Multi-context workloads.
>  * Variable (random) batch length.
>  * Load balancing (round robin and queue depth estimation).
>  * Workloads delays and explicit sync steps.
>  * Workload frequency (period) control.
>
> v4:
>  * Fixed queue-depth estimation by creating separate batches
>    per engine when qd load balancing is on.
>  * Dropped separate -s cmd line option. It can turn itself on
>    automatically when needed.
>  * Keep a single status page and lie about the write hazard
>    as suggested by Chris.
>  * Use batch_start_offset for controlling the batch duration.
>    (Chris)
>  * Set status page object cache level. (Chris)
>  * Moved workload description to a README.
>  * Tidied example workloads.
>  * Some other cleanups and refactorings.
>
> v5:
>  * Master and background workloads (-W / -w).
>  * Single batch per step is enough even when balancing. (Chris)
>  * Use hars_petruska_f54_1_random IGT functions and seed to zero
>    at start. (Chris)
>  * Use WC cache domain when WC mapping. (Chris)
>  * Keep seqnos 64-bytes apart in the status page. (Chris)
>  * Add workload throttling and queue-depth throttling commands.
>    (Chris)
>
> v6:
>  * Added two more workloads.
>  * Merged RT balancer from Chris.
>
> TODO list:
>
>  * No reloc!
>  * bb caching/reuse
>  * Fence support.
>  * Better error handling.
>  * Less 1980's workload parsing.
>  * More workloads.
>  * Threads?
>  * ... ?
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@xxxxxxxxx>
> ---

> +static enum intel_engine_id
> +rt_balance(const struct workload_balancer *balancer,
> +           struct workload *wrk, struct w_step *w)
> +{
> +        enum intel_engine_id engine;
> +        long qd[NUM_ENGINES];
> +        unsigned int n;
> +
> +        igt_assert(w->engine == VCS);
> +
> +        /* Estimate the "speed" of the most recent batch
> +         *    (finish time - submit time)
> +         * and use that as an approximate for the total remaining time for
> +         * all batches on that engine. We try to keep the total remaining
> +         * balanced between the engines.
> +         */

The next step for this would be to move from an instantaneous speed to an
average. I'm thinking of something like an exponential decay moving average,
just to make the estimation more robust.

> +                if (qd_throttle > 0 && balancer && balancer->get_qd) {
> +                        unsigned int target;
> +
> +                        for (target = wrk->nr_steps - 1; target > 0;
> +                             target--) {

I think this should skip other engines.

        if (target->engine != engine)
                continue;

> +                                if (balancer->get_qd(balancer, wrk,
> +                                                     engine) <
> +                                    qd_throttle)
> +                                        break;
> +                                w_sync_to(wrk, w, i - target);
> +                        }

-- 
Chris Wilson, Intel Open Source Technology Centre
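
For reference, a minimal sketch of what the exponential decay moving average
suggested above could look like, kept separate from the patch. Everything
here is an assumption for illustration only: struct ema_latency,
ema_update(), ema_remaining_ns() and the alpha weight are hypothetical names
that do not exist in the tool, and scaling the average by the number of
pending batches is just one plausible reading of the "total remaining time"
estimate in the rt_balance() comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-engine estimator state; not part of the patch. */
struct ema_latency {
	double avg_ns;	/* smoothed (finish time - submit time) */
	bool primed;	/* false until the first sample has been seen */
};

/*
 * Fold one latency sample into the running average:
 *   avg = alpha * sample + (1 - alpha) * avg
 * alpha must be in (0, 1]; alpha == 1 degenerates to the instantaneous
 * speed, smaller values give older samples more weight.
 */
static double ema_update(struct ema_latency *e, double sample_ns, double alpha)
{
	assert(alpha > 0.0 && alpha <= 1.0);

	if (!e->primed) {
		/* Seed with the first sample instead of decaying up from zero. */
		e->avg_ns = sample_ns;
		e->primed = true;
	} else {
		e->avg_ns += alpha * (sample_ns - e->avg_ns);
	}

	return e->avg_ns;
}

/*
 * Assumed use: approximate the remaining time on an engine as the smoothed
 * per-batch latency times the number of batches still pending there.
 */
static double ema_remaining_ns(const struct ema_latency *e,
			       unsigned int pending)
{
	return e->primed ? e->avg_ns * pending : 0.0;
}
```

With alpha = 0.5, samples of 100 and then 200 give an average of 150, so a
single slow or fast batch moves the estimate only halfway towards the new
value. A balancer would keep one such struct per engine and pick the engine
with the smallest ema_remaining_ns().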