Hi,

After some brainstorming, we decided to go big and grab all the
hardware-provided defaults for all registers. That's for the sake of not
having to decide which registers are important.

This will also allow us not to be concerned about any context registers
possibly containing 'secure' information being leaked, which we currently
don't expect from any known userspace. But you never know what somebody
could be doing with unused-for-them registers :)

The fix is not quite stable material, containing changes to the hardware
initialization ordering and being rather invasive in nature.

We're currently discussing the uAPI to expose to userspace, please stay
tuned. It'll likely be a bitmask of engine classes, which will shortly
be introduced from the PMU work.

Regards, Joonas

On Wed, 2017-10-25 at 10:53 -0700, Jason Ekstrand wrote:
> On Wed, Oct 25, 2017 at 10:31 AM, Kenneth Graunke <kenneth@xxxxxxxxxxxxx> wrote:
> > On Wednesday, October 25, 2017 7:33:41 AM PDT Jason Ekstrand wrote:
> > > On October 25, 2017 06:05:16 Joonas Lahtinen wrote:
> > [snip]
> > > > There indeed seems to be quite a lot of missing registers from the i915
> > > > driver where the context is initialized. (Psst. You can read that as:
> > > > "all the 33 non-privileged registers we could quickly list, are
> > > > missing").
> > >
> > > We probably don't need *all* of them initialized. For instance, the
> > > initial values of the ALU registers or the indirect draw parameter
> > > registers will probably never matter. However, if you want to just
> > > initialize them all, that's fine.
> >
> > I agree - I think we can cut down the list substantially, if you like.
> > Here's my breakdown of Skylake's non-privileged register list:
> >
> > Cache_Mode_0                0x7000
> > Cache_Mode_1                0x7004
> > GT_MODE                     0x7008
> > L3_Config                   0x7034
> > TD_CTL                      0xE400
> > TD_CTL2                     0xE404
> > L3SQCREG4                   0xB118
> > NOPID                       0x2094
> > INSTPM                      0x20C0
> >
> > Should be initialized by the kernel.
> > Several of these can severely
> > break unsuspecting userspace, and we'd like to be able to rely on a
> > default value.
> >
> > IA_VERTICES_COUNT           0x2310
> > IA_PRIMITIVES_COUNT         0x2318
> > VS_INVOCATION_COUNT         0x2320
> > HS_INVOCATION_COUNT         0x2300
> > DS_INVOCATION_COUNT         0x2308
> > GS_INVOCATION_COUNT         0x2328
> > GS_PRIMITIVES_COUNT         0x2330
> > SO_NUM_PRIMS_WRITTEN0       0x5200
> > SO_NUM_PRIMS_WRITTEN1       0x5208
> > SO_NUM_PRIMS_WRITTEN2       0x5210
> > SO_NUM_PRIMS_WRITTEN3       0x5218
> > SO_PRIM_STORAGE_NEEDED0     0x5240
> > SO_PRIM_STORAGE_NEEDED1     0x5248
> > SO_PRIM_STORAGE_NEEDED2     0x5250
> > SO_PRIM_STORAGE_NEEDED3     0x5258
> > CL_INVOCATION_COUNT         0x2338
> > CL_PRIMITIVES_COUNT         0x2340
> > PS_INVOCATION_COUNT_0       0x22C8
> > PS_DEPTH_COUNT_0            0x22D8
> > PS_INVOCATION_COUNT_1       0x22F0
> > PS_DEPTH_COUNT_1            0x22F8
> > PS_INVOCATION_COUNT_2       0x2448
> > PS_DEPTH_COUNT_2            0x2450
> > GPGPU_THREADS_DISPATCHED    0x2290
> >
> > The kernel can skip these if you like. Statistics registers just count
> > things, and userspace always calculates (end counter - start counter)
> > deltas, so the initial value doesn't really matter.
> >
> > SO_WRITE_OFFSET0            0x5280
> > SO_WRITE_OFFSET1            0x5284
> > SO_WRITE_OFFSET2            0x5288
> > SO_WRITE_OFFSET3            0x528C
> > GPGPU_DISPATCHDIMX          0x2500
> > GPGPU_DISPATCHDIMY          0x2504
> > GPGPU_DISPATCHDIMZ          0x2508
> > MI_PREDICATE_SRC0           0x2400
> > MI_PREDICATE_SRC0           0x2404
> > MI_PREDICATE_SRC1           0x2408
> > MI_PREDICATE_SRC1           0x240C
> > MI_PREDICATE_DATA           0x2410
> > MI_PREDICATE_DATA           0x2414
> > MI_PREDICATE_RESULT         0x2418
> > MI_PREDICATE_RESULT_1       0x241C
> > MI_PREDICATE_RESULT_2       0x23BC
> > 3DPRIM_END_OFFSET           0x2420
> > 3DPRIM_START_VERTEX         0x2430
> > 3DPRIM_VERTEX_COUNT         0x2434
> > 3DPRIM_INSTANCE_COUNT       0x2438
> > 3DPRIM_START_INSTANCE       0x243C
> > 3DPRIM_BASE_VERTEX          0x2440
> >
> > The kernel can skip these if you like, IMO.
> > These registers are only
> > used when enabling an optional feature - stream out (SO_WRITE_*),
> > indirect compute dispatch (GPGPU_*), predicated draws (MI_PREDICATE_*),
> > indirect draws (3DPRIM_*). Userspace has to explicitly opt in to each
> > of these features by enabling a flag, so there isn't a cross-context
> > contamination problem. If userspace opts in to these features, it can
> > be responsible for programming the registers correctly.
> >
> > CS_GPR (1-16)               0x2600
> >
> > The kernel can skip these if you like. They're temporary storage when
> > using the MI_MATH instruction. Example usage: load values into CS_GPR1
> > and CS_GPR2, add them, store the result in CS_GPR3. Store to memory.
> >
> > Nobody should be doing math on register values without setting them.
> > That's clearly a userspace bug.
> >
> > BB_OFFSET                   0x2158
>
> This is used for indirect BATCH_BUFFER_START which is a thing on SKL+ I
> believe (I didn't look at the docs).
>
> > OA_CTX_CONTROL              0x2360
> > OACTXID                     0x2364
> > OA CONTROL                  0x2B00
> > PERF_CNT_1_DW0              0x91b8
> > PERF_CNT_1_DW1              0x91bc
> > PERF_CNT_2_DW0              0x91c0
> > PERF_CNT_2_DW1              0x91c4
> >
> > I don't know about these.

-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
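
A note on the statistics registers discussed in the quoted breakdown: their
initial value does not matter because userspace only ever looks at
(end counter - start counter). Below is a minimal sketch of that reasoning in
Python. It assumes free-running 64-bit counters that wrap modulo 2^64; the
counter width and the names COUNTER_BITS, counter_delta() and simulate() are
illustrative assumptions made for this sketch, not taken from i915 or Mesa.

```python
# Illustrative only: models a free-running hardware statistics counter.
# COUNTER_BITS, counter_delta() and simulate() are assumptions made for
# this sketch; they are not real driver or userspace APIs.
COUNTER_BITS = 64
MASK = (1 << COUNTER_BITS) - 1


def counter_delta(start, end):
    """Return how many events happened between two counter snapshots.

    The subtraction is done modulo 2**COUNTER_BITS, so the result is the
    same whatever value the counter held at context creation, even if it
    wrapped around between the two reads.
    """
    return (end - start) & MASK


def simulate(initial_value, events):
    """Snapshot a counter, let it count `events` events, snapshot again."""
    start = initial_value & MASK
    end = (initial_value + events) & MASK
    return counter_delta(start, end)


if __name__ == "__main__":
    # Same workload, three different (uninitialized) starting values,
    # including one that wraps around during the measurement.
    for initial in (0, 0xDEADBEEF, MASK - 5):
        print(hex(initial), simulate(initial, 1000))
```

All three starting values give the same delta of 1000, which is why leaving
these counters uninitialized is harmless as long as userspace sticks to
deltas.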