Quoting Mika Kuoppala (2019-09-24 11:21:38) > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > Before we submit the first context to HW, we need to construct a valid > > image of the register state. This layout is defined by the HW and should > > match the layout generated by HW when it saves the context image. > > Asserting that this should be equivalent should help avoid any undefined > > behaviour and verify that we haven't missed anything important! > > > > Of course, having insisted that the initial register state within the > > LRC should match that returned by HW, we need to ensure that it does. > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> > > +static u32 *set_offsets(u32 *regs, > > + const u8 *data, > > + const struct intel_engine_cs *engine) > > +#define NOP(x) (BIT(7) | (x)) > > +#define LRI(count, flags) ((flags) << 6 | (count)) > > +#define POSTED BIT(0) > > +#define REG(x) (((x) >> 2) | BUILD_BUG_ON_ZERO(x >= 0x200)) > > +#define REG16(x) \ > > + (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x10000)), \ > > + (((x) >> 2) & 0x7f) > > I am still not sure if the actual saving are worth the complexity. > > > +#define END() 0 > > +{ > > + const u32 base = engine->mmio_base; > > + > > + while (*data) { > > + u8 count, flags; > > + > > + if (*data & BIT(7)) { /* skip */ > > + regs += *data++ & ~BIT(7); > > + continue; > > + } > > + > > + count = *data & 0x3f; > > + flags = *data >> 6; > > + data++; > > + > > + *regs = MI_LOAD_REGISTER_IMM(count); > > + if (flags & POSTED) > > + *regs |= MI_LRI_FORCE_POSTED; > > + if (INTEL_GEN(engine->i915) >= 11) > > + *regs |= MI_LRI_CS_MMIO; > > + regs++; > > + > > + GEM_BUG_ON(!count); > > + do { > > + u32 offset = 0; > > + u8 v; > > + > > + do { > > + v = *data++; > > + offset <<= 7; > > + offset |= v & ~BIT(7); > > + } while (v & BIT(7)); > > ...but perhaps this amount of extra can be tolerated. > > Did you check how this would play out with just REG being wide enough? When I started, I thought we could get away with only one REG16. Looking at the context image I think we might want a few non engine->mmio_base regs in there (if I read it right, some of the 0x4000 range are per engine). That will need a slightly different encoding as well :| No, I haven't but since you ask, I shall. > > + > > + *regs = base + (offset << 2); > > In here reader is yearning for an asserts of not trampling > on wrong territory. If you have an idea for a good assert, go for it :) What range should be checked. offset < 0x1000 ? > But I would guess that you want this part to be like > oiled lightning and test the machinery with selftest..as the > subject seems to promise. The importance is certainly placed on having a selftest and the confidence in keeping our offsets in line with the HW. The goal was to have a compact description for the register offsets, in terms of readability I think the emphasis should be on the tables (gen8_xcs_offsets[]). > > @@ -3092,7 +3451,7 @@ void intel_execlists_set_default_submission(struct intel_engine_cs *engine) > > engine->flags |= I915_ENGINE_HAS_PREEMPTION; > > } > > > > - if (engine->class != COPY_ENGINE_CLASS && INTEL_GEN(engine->i915) >= 12) > > + if (engine->class != COPY_ENGINE_CLASS && INTEL_GEN(engine->i915) >= 11) > > engine->flags |= I915_ENGINE_HAS_RELATIVE_MMIO; > > Ok, first I thought this was unintentional. But prolly not. > Do you need it for the verifier to work? No, I ended up completely ignoring this flag as the HW does not differentiate between engines. On gen11+, it sets the LRI flag everywhere in the context image. > Could we still rip it out to be a first in the series. > Just would want to differiante possible icl hickups apart > from this patch. Sure. > > +static int live_lrc_layout(void *arg) > > +{ > > + struct intel_gt *gt = arg; > > + struct intel_engine_cs *engine; > > + enum intel_engine_id id; > > + u32 *mem; > > + int err; > > + > > + /* > > + * Check the registers offsets we use to create the initial reg state > > + * match the layout saved by HW. > > + */ > > + > > + mem = kmalloc(PAGE_SIZE, GFP_KERNEL); > > + if (!mem) > > + return -ENOMEM; > > + > > + err = 0; > > + for_each_engine(engine, gt->i915, id) { > > + u32 *hw, *lrc; > > + int dw; > > + > > + if (!engine->default_state) > > + continue; > > + > > + hw = i915_gem_object_pin_map(engine->default_state, > > + I915_MAP_WB); > > This default state is not pristine as we have trampled > it with our first submission, right? It is the context image saved after the first request. > But being succeeded at doing so, the next context > save should overwrite our trampling and it would > then represent the hw accurate context save > state. > > Against which we will compare of our reg state > writer. Right, default_state is the HW version of our init_reg_state. > > + if (IS_ERR(hw)) { > > + err = PTR_ERR(hw); > > + break; > > + } > > + hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw); > > + > > + lrc = memset(mem, 0, PAGE_SIZE); > > + execlists_init_reg_state(lrc, > > + engine->kernel_context, > > + engine, > > + engine->kernel_context->ring, > > + true); > > + > > + dw = 0; > > + do { > > + u32 lri = hw[dw]; > > + > > + if (lri == 0) { > > + dw++; > > + continue; > > + } > > + > > + if ((lri & GENMASK(31, 23)) != MI_INSTR(0x22, 0)) { > > + pr_err("%s: Expected LRI command at dword %d, found %08x\n", > > + engine->name, dw, lri); > > + err = -EINVAL; > > + break; > > + } > > + > > + if (lrc[dw] != lri) { > > + pr_err("%s: LRI command mismatch at dword %d, expected %08x found %08x\n", > > + engine->name, dw, lri, lrc[dw]); > > + err = -EINVAL; > > + break; > > + } > > + > > + lri &= 0x7f; > > + lri++; > > + dw++; > > + > > + while (lri) { > > + if (hw[dw] != lrc[dw]) { > > + pr_err("%s: Different registers found at dword %d, expected %x, found %x\n", > > + engine->name, dw, hw[dw], lrc[dw]); > > + err = -EINVAL; > > + break; > > + } > > + > > + /* > > + * Skip over the actual register value as we > > + * expect that to differ. > > + */ > > + dw += 2; > > + lri -= 2; > > This makes me wonder if we could use this machinery post hang. Just to > get a little more triage data out, ie 'your context looks corrupted at > offset %x'... Certainly possible, but what we check here is _mostly_ the privileged registers that are not really meant to be changed by the user -- and we are only checking the offsets, so unlikely there to be just one wrong. The general principle was that we should provide raw information and have the smarts in userspace (so that we could always enhance our processing and reanalyse existing dumps). But at the end of the day, whatever allows us to prevent bugs or fix bugs is paramount. But I'm not yet sold this helps. Maybe if we find an example where it proves useful... > > + } > > + } while ((lrc[dw] & ~BIT(0)) != MI_BATCH_BUFFER_END); > > Ok, you tie up always the generate image. For future work add the hw batch > endpoint be a part of checker? It's not always in the first page, I'm not even sure if a BB_END is always included in the older gen. (I have a feeling the HW definitely started including it ~gen10.) -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx