On Sun, Feb 26, 2017 at 02:41:50PM -0800, Chad Versace wrote: > On Wed 04 Jan 2017, ville.syrjala@xxxxxxxxxxxxxxx wrote: > > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > SKL+ display engine can scan out certain kinds of compressed surfaces > > produced by the render engine. This involved telling the display engine > > the location of the color control surfae (CCS) which describes which > > parts of the main surface are compressed and which are not. The location > > of CCS is provided by userspace as just another plane with its own offset. > > > > By providing our own format information for the CCS formats, we should > > be able to make framebuffer_check() do the right thing for the CCS > > surface as well. > > > > Note that we'll return the same format info for both Y and Yf tiled > > format as that's what happens with the non-CCS Y vs. Yf as well. If > > desired, we could potentially return a unique pointer for each > > pixel_format+tiling+ccs combination, in which case we immediately be > > able to tell if any of that stuff changed by just comparing the > > pointers. But that does sound a bit wasteful space wise. > > > > v2: Drop the 'dev' argument from the hook > > v3: Include the description of the CCS surface layout > > > > Cc: Vandana Kannan <vandana.kannan@xxxxxxxxx> > > Cc: Daniel Vetter <daniel@xxxxxxxx> > > Cc: Ben Widawsky <ben@xxxxxxxxxxxx> > > Cc: Jason Ekstrand <jason@xxxxxxxxxxxxxx> > > Reviewed-by: Ben Widawsky <ben@xxxxxxxxxxxx> > > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > --- > > drivers/gpu/drm/i915/intel_display.c | 36 ++++++++++++++++++++++++++ > > include/uapi/drm/drm_fourcc.h | 49 ++++++++++++++++++++++++++++++++++++ > > 2 files changed, 85 insertions(+) > > > > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h > > index 9e1bb7fabcde..4581e3d41e5c 100644 > > --- a/include/uapi/drm/drm_fourcc.h > > +++ b/include/uapi/drm/drm_fourcc.h > > @@ -230,6 +230,55 @@ extern "C" { > > #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3) > > > > /* > > + * Intel color control surface (CCS) for render compression > > + * > > + * The framebuffer format must be one of the 8:8:8:8 RGB formats, > > + * the main surface will be plane index 0 and will be Y/Yf-tiled, > > + * the CCS will be plane index 1. > > + * > > The paragraph below is... > > > + * Each byte of CCS contains 4 pairs of bits, with each pair of bits > > + * matching an area of 8x4 pixels of the main surface. Which would seem > > + * to match 2 cachelines containing 4x4 pixels each. The pairs bits within > > + * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the > > + * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb): > > + * ----------- > > + * | 01 | 23 | > > + * ---------- > > + * | 45 | 67 | > > + * ----------- > > ...almost correct. The hw docs state that each bit-pair of the CCS maps > cacheline-pair, horizontally adjacent in the Y tile, of the primary surface. As > a consequence, the remainder of the above paragraph is correct for 32-bit > formats, but not others. Which is why the comment says at the very top that the fb needs to use a 8:8:8:8 format. IIRC that's the only thing the hardware supports. > > This is not a nitpick, because Vulkan and EGL users may want to share > dma_bufs with a fat format like R32G32B32A32_UNORM, and want to have CCS > enabled when possible. As long as the users use the dma_buf only the 3D > engine, and don't submit it to KMS, it's all safe. > > For those users, we should document the bit-pair/cacheline-pair relationship > correctly. And then preface the following detailed explanation and nice ascii > diagrams by saying "this applies only to the 32-bit formats". > > Here's the relevant PRM quote: > > The Color Control Surface (CCS) contains the compression status > of the cache-line pairs. The compression state of the cache-line > pair is specified by 2 bits in the CCS. Each CCS cache-line > represents an area on the main surface of 16x16 sets of 128 byte > Y-tiled cache-line-pairs. CCS is always Y tiled. > > See this Mesa comment for more details: > https://cgit.freedesktop.org/mesa/mesa/tree/src/intel/isl/isl.c?h=17.0#n211 > > > + * Actually only the lower bit of the pair seems to have any effect. > > + * No idea why. 0 in the lower bit would seem to mean not compressed, > > + * and 1 is compressed. The interpreation of the main surface data > > + * when the block is marked compressed is unknown as of now. > > If I recall correctly (it's been several months since I inspected the > bits), the high bit is actually significant. Bit pattern 11 means that > the data in primary surface's cacheline-pair is invalid; the 3D engine > interprets the color of all pixels in that cacheline-pair to be the > clear color stored in RENDER_SURFACE_STATE. Before the display engine > can consume the surface, userspace is required to do a partial resolve, > flushing the clear color into the primary surface. So it makes sense > that the kernel would never observe that bit pattern in an incoming ccs > surface, as long as userspace is doing its job correctly. And it makes > sense that the display engine would ignore the high bit, because there is > no way to provide the clear color to the display engine (at least > according my reading of the docs). > > Jason, does this match your understanding of the high bit? > > > + * > > + * CCS tile is laid out in 8 byte horizontal strips each strip thus corresponds > > + * to a 128x8 pixel are of the main surface. So each 8x8 bytes of the CCS > > + * (1 cacheline) will match an area of 4x2 tiles on the main surface. > > + * > > + * Here is the layout of a full CCS tile, with the 8 byte strips numbered 0-511: > > + * ------------------------ > > + * | 0 | 64 | ... | 448 | > > + * | 1 | 65 | | 449 | > > + * | 2 | 66 | | 450 | > > + * | . | . | | . | > > + * | . | . | | . | > > + * | . | . | | . | > > + * | 63 | 127 | | 511 | > > + * ------------------------ > > + * > > + * This will match an area of 1024x512 pixels on the main surface. > > + * > > + * The CCS plane pitch must be a multiple of the CCS tile width (64 bytes), > > + * and for the purposes of the CCS plane offset we assume cpp==1. As each > > + * byte matches a 16x8 area of the main surface, the dimensions of the CCS > > + * plane will thus appear to be width/16 x height/8. Both planes must be > > + * contained within the same gem object. > > Again, the above paragraphs should clarify that they apply only to 32-bit formats. > > > + */ > > +#define I915_FORMAT_MOD_Y_TILED_CCS fourcc_mod_code(INTEL, 4) > > +#define I915_FORMAT_MOD_Yf_TILED_CCS fourcc_mod_code(INTEL, 5) -- Ville Syrjälä Intel OTC _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel