On Fri, Aug 04, 2017 at 07:56:11AM -0700, Jason Ekstrand wrote: > On August 4, 2017 2:59:56 AM Daniel Stone <daniel@xxxxxxxxxxxxx> wrote: > > > Hi Jason, > > > > On 4 August 2017 at 01:52, Jason Ekstrand <jason@xxxxxxxxxxxxxx> wrote: > > > Previously, the test used the old 64x64 convention that Ville introduced > > > for CCS tiles and not the current 128x32 Y-tile convention. Also, the > > > original scheme for generating the CCS data was over-complicated and > > > didn't work correctly because it assumed you could cut the main surface > > > at an arbitrary Y coordinate. While you clearly *can* do this (the > > > hardware does), it's not a good idea for a generator in a test. The new > > > scheme, introduced here, is entirely based on the relationship between > > > cache-lines in the main surface and the CCS that's documented in the > > > PRM. By keeping everything CCS cache-line aligned, our chances of > > > generating correct data for an arbitrary-size surface are much higher. > > > > Thanks for fixing this! > > > > > + * We need to cut the surface on a CCS cache-line boundary, > > > + * otherwise, we're going to be in trouble when we try to > > > + * generate CCS data for the surface. A cache line in the > > > + * CCS is 16x16 cache-line-pairs in the main surface. 16 > > > + * cache lines is 64 rows high. > > > + */ > > > + half_height = ALIGN(height, 128) / 2; > > > + half_size = half_height * stride; > > > + for (i = 0; i < fb->size / 4; i++) { > > > + if (i < half_size / 4) > > > + ptr[i] = RED; > > > + else > > > + ptr[i] = COMPRESSED_RED; > > > > I toyed with: > > else if (!(i & 0xe)) > > ptr[i] = COMPRESSED_RED; > > else > > ptr[i] = BLACK; > > > > ... to make the failure mode even more obvious. But that still writes > > in (far) more compressed-red pixels than we strictly need for CCS, > > right? Anyway, follow-up patch. > > Yeah, we can probably do quite a bit of patterning so long as we keep the > compressed/uncompressed split simple and make sure our pattern works in > whole cache lines. > > > > +static unsigned int > > > +y_tile_y_pos(unsigned int offset, unsigned int stride) > > > { > > > - return ptr + > > > - ((y & ~0x3f) * stride) + > > > - ((x & ~0x7) * 64) + > > > - ((y & 0x3f) * 8) + > > > - (x & 7); > > > + unsigned int y_tiles, y; > > > + y_tiles = (offset / 4096) / (stride / 128); > > > + y = y_tiles * 32 + ((offset & 0x1f0) >> 4); > > > + return y; > > > } > > > > > > @@ -143,12 +168,26 @@ static void display_fb(data_t *data, int compressed) > > > size[0] = f.pitches[0] * ALIGN(height, 32); > > > > > > if (compressed) { > > > - width = ALIGN(f.width, 16) / 16; > > > - height = ALIGN(f.height, 8) / 8; > > > - f.pitches[1] = ALIGN(width * 1, 64); > > > + /* From the Sky Lake PRM, Vol 12, "Color Control Surface": > > > + * > > > + * "The compression state of the cache-line pair is > > > + * specified by 2 bits in the CCS. Each CCS cache-line > > > + * represents an area on the main surface of 16x16 sets > > > + * of 128 byte Y-tiled cache-line-pairs. CCS is always Y > > > + * tiled." > > > + * > > > + * A "cache-line-pair" for a Y-tiled surface is two > > > + * horizontally adjacent cache lines. When operating in > > > + * bytes and rows, this gives us a scale-down factor of > > > + * 32x16. Since the main surface has a 32-bit format, we > > > + * need to multiply width by 4 to get bytes. > > > + */ > > > > Yeah, this code and comment match better (& helped clarify) my > > understanding of how it works. But given my understanding was mostly > > gleaned from other, borderline incomprehensible, comments, as well as > > manually poking bits and observing the result, maybe that's not a > > ringing endorsement. > > > > > + width = ALIGN(f.width * 4, 32) / 32; > > > + height = ALIGN(f.height, 16) / 16; > > > + f.pitches[1] = ALIGN(width * 1, 128); > > > f.modifier[1] = modifier; > > > f.offsets[1] = size[0]; > > > - size[1] = f.pitches[1] * ALIGN(height, 64); > > > + size[1] = f.pitches[1] * ALIGN(height, 32); > > > > I changed this to f.height rather than height, because otherwise the > > kernel was rejecting the aux buffer for being too small. > > Congratulations, you found a bug in the kernel branch you're running. The > downsized height is definitely what we want and it works fine with my kernel > branch. This is why I'd like someone (post or pre merging the kernel patches) to add a pile of subtests here (in the spirit of kms_addfb_basic) which goes through all kinds of invalid alignment/settings and makes sure the kernel catches them all. The current single invalid testcase we have is imo way too trustful of kernel hacker's ability to get this right :-) Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx