Re: [PATCH v18 2/2] drm/i915/selftests: Introduce live tests of private PAT management

Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> · Fri, 22 Sep 2017 13:34:16 +0300

On Fri, 2017-09-22 at 01:28 +0800, Zhi Wang wrote:
> Introduce two live tests of private PAT management:
> 
> igt_ppat_init - This test is to check if all the PPAT configurations are
> written into HW.
> 
> igt_ppat_get - This test performs several sub-tests on intel_ppat_get()
> and intel_ppat_put().
> 
> The "perfect match" test case will try to get a PPAT entry with an existing
> value, then check if the returned PPAT entry is the same one.
> 
> The "alloc entries" test case will run out of PPAT table, and check if all
> the requested values are put into the newly allocated PPAT entries.
> 
> The negative test case will try to generate a new PPAT value, and get it
> when PPAT table is full.
> 
> The "partial match" test case will generate a parital matched value from
> the existing PPAT table and try to match it.
> 
> The "re-alloc" test case will try to free and then allocate a new entry
> when the PPAT table is full.
> 
> The "put entries" test case will free all the PPAT entries that allocated
> in "alloc entries" test case. It will check if the values of freed PPAT
> entries turn into ppat->clear_value.
> 
> v18:
> 
> - Refine the test to catch a corner case.
> 
> v10:
> 
> - Refine code structure.
> - Introduce "re-alloc" test case. (Chris)
> 
> v9:
> 
> - Refine generate_new_value(). (Chris)
> - Refine failure output. (Chris)
> - Refine test flow of "perfect match". (Chris)
> - Introduce another negative test case after "partial match". (Chris)
> 
> v8:
> 
> - Remove noisy output. (Chris)
> - Add negative test case. (Chris)
> 
> Cc: Ben Widawsky <benjamin.widawsky@xxxxxxxxx>
> Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
> Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> Suggested-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Zhi Wang <zhi.a.wang@xxxxxxxxx>

<SNIP>

> +static int igt_ppat_check(void *arg)
> +{
> +	struct drm_i915_private *i915 = arg;
> +	int ret;
> +
> +	if (!i915->ppat.max_entries)
> +		return 0;
> +
> +	if (INTEL_GEN(i915) >= 10)
> +		ret = check_cnl_ppat(i915);
> +	else
> +		ret = check_bdw_ppat(i915);
> +
> +	if (ret)
> +		pr_err("check PPAT failed\n");

This is just extra noise, both individual dest already say check PPAT
failed.

> +
> +	return ret;
> +}
> +
> +static bool value_is_new(struct intel_ppat *ppat, u8 value)

Bit overly generic name for selftests/i915_gem_gtt.c, how about
'ppat_is_new' or 'ppat_find_match_all'.

> +{
> +	int i;
> +
> +	for_each_set_bit(i, ppat->used, ppat->max_entries) {
> +		if (value != ppat->entries[i].value)

		if (value == ppat->entries[i])
			return i;

And drop the braces around for.

	return -ENOENT;

> +static bool value_for_partial_test(struct intel_ppat *ppat, u8 value)

Maybe 'ppat_find_match_ca', similar changes to above function to return
index.

> +{
> +	int i;
> +
> +	if (!value_is_new(ppat, value))
> +		return false;
> +
> +	/*
> +	 * At least, there should be one entry whose cache attribute is
> +	 * same with the required value.
> +	 */

Comment says what the code does, so can be dropped.

> +	for_each_set_bit(i, ppat->used, ppat->max_entries) {
> +		if (GEN8_PPAT_GET_CA(value) !=
> +		    GEN8_PPAT_GET_CA(ppat->entries[i].value))

Again '==' and return true.

> +			continue;
> +
> +		return true;
> +	}
> +	return false;
> +}
> +
> +static bool value_for_negative_test(struct intel_ppat *ppat, u8 value)

Maybe 'ppat_find_not_match_ca', same with returning index / -ENOENT.

> +{
> +	int i;
> +
> +	if (!value_is_new(ppat, value))
> +		return false;
> +
> +	/*
> +	 * cache attribute has to be different, so i915_ppat_get() would
> +	 * allocate a new entry.
> +	 */

Ditto.

> +	for_each_set_bit(i, ppat->used, ppat->max_entries) {
> +		if (GEN8_PPAT_GET_CA(value) ==
> +		    GEN8_PPAT_GET_CA(ppat->entries[i].value))
> +			return false;

You can drop the braces, here too.

> +	}
> +	return true;
> +}
> +
> +static u8 generate_new_value(struct intel_ppat *ppat,
> +			     bool (*check_value)(struct intel_ppat *, u8))

'ppat_generate_new' or so. Also, by definition of _new, this function
should be doing the check if it already exists, not the filter function
(unnecessary code duplication).

> +{
> +	u8 ca[] = { GEN8_PPAT_WB, GEN8_PPAT_WT, GEN8_PPAT_UC, GEN8_PPAT_WC };
> +	u8 tc[] = { GEN8_PPAT_LLC, GEN8_PPAT_LLCELLC, GEN8_PPAT_LLCeLLC };
> +	u8 age[] = { GEN8_PPAT_AGE(3), GEN8_PPAT_AGE(2), GEN8_PPAT_AGE(1),
> +		     GEN8_PPAT_AGE(0) };
> +	int ca_index, tc_index, age_index;
> +	u8 value;
> +
> +#define for_each_ppat_attr(ca_index, tc_index, age_index) \
> +	for ((ca_index) = 0 ; (ca_index) < ARRAY_SIZE(ca); (ca_index)++) \
> +	for ((tc_index) = 0; (tc_index) < ARRAY_SIZE(tc); (tc_index)++) \
> +	for ((age_index) = 0; (age_index) < ARRAY_SIZE(age); (age_index)++)
> +
> +	for_each_ppat_attr(ca_index, tc_index, age_index) {
> +		value = age[age_index] | ca[ca_index] | tc[tc_index];
> +		if (check_value(ppat, value))
> +			return value;
> +	}
> +#undef for_each_ppat_attr
> +	return 0;

	-ENOSPC?

> +}
> +
> +static const struct intel_ppat_entry *
> +generate_and_check_new_value(struct intel_ppat *ppat)
> +{
> +	struct drm_i915_private *i915 = ppat->i915;
> +	const struct intel_ppat_entry *entry;
> +	u8 value;

Maybe swap these two declarations for extra fine tune.

> +	DECLARE_BITMAP(used, INTEL_MAX_PPAT_ENTRIES);
> +
> +	value = generate_new_value(ppat, value_is_new);
> +	if (!value) {
> +		pr_err("fail to generate new value\n");
> +		return ERR_PTR(-EINVAL);
> +	}

It's better to propagate errors, as much as possible, will be easier
for the next person to extend the tests, possibly adding new error
conditions.

	value = ppat_generate_new(..);
	if (IS_ERR(value)) {
		..
		return ERR_PTR(value);
	}

> +	bitmap_copy(used, ppat->used, ppat->max_entries);
> +
> +	entry = intel_ppat_get(i915, value);
> +	if (IS_ERR(entry)) {
> +		pr_err("fail to get new entry\n");
> +		return ERR_PTR(-EINVAL);
> +	}

Shouldn't the selftest very strictly differentiate between, "unable to
run test" (resources are in use, for example) and "i915 driver failed
the test". Here both problems translate into -EINVAL. Please check the
drm_mm tests for examples.

> +
> +	if (entry->value != value) {
> +		pr_err("value is not expected, expected %x found %x\n",
> +				value, entry->value);
> +		goto err;
> +	}
> +
> +	if (bitmap_equal(used, ppat->used, ppat->max_entries)) {
> +		pr_err("fail to alloc a new entry\n");
> +		goto err;
> +	}

This would be the more logical test to do first, I don't think it gets
triggered otherwise.

> +
> +	return entry;

Newline here.

> +err:
> +	intel_ppat_put(entry);

Usually here too.

> +	return ERR_PTR(-EINVAL);
> +}
> +
> +static int put_and_check_entry(const struct intel_ppat_entry *entry)

add word 'ppat' somewhere.

> +{
> +	intel_ppat_put(entry);
> +
> +	if (entry->value != entry->ppat->clear_value) {
> +		pr_err("fail to put ppat value\n");

Wouldn't 'ppat value was not cleared' be more informative, if
intel_ppat_put returned error, this would be appropriate.

> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
> +static int perform_perfect_match_test(struct intel_ppat *ppat)

s/perform/ppat/

> +{
> +	struct drm_i915_private *i915 = ppat->i915;
> +	const struct intel_ppat_entry *entry;
> +	int i;
> +
> +	for_each_set_bit(i, ppat->used, ppat->max_entries) {
> +		entry = intel_ppat_get(i915, ppat->entries[i].value);
> +		if (IS_ERR(entry))
> +			return PTR_ERR(entry);
> +

It's not obvious without looking elsewhere in the code, so it might be
worth mentioning that all entries have been generated to be unique.

		/* All generated PPAT entries are unique */

> +		if (entry != &ppat->entries[i]) {
> +			pr_err("entry is not expected\n");

Do add "goto err" and this becomes.

		if (entry != ..)
			goto err;

Just by glancing at the code, you can see we're doing an extremely 
simple test. Then, without disrupting the code flow, at err: label, you
can add details about what was expected and what we got.

> +			intel_ppat_put(entry);
> +			return -EINVAL;
> +		}
> +		intel_ppat_put(entry);
> +	}
> +	return 0;
> +}
> +
> +static int perform_negative_test(struct intel_ppat *ppat)
> +{
> +	struct drm_i915_private *i915 = ppat->i915;
> +	const struct intel_ppat_entry *entry;
> +	u8 value;
> +
> +	value = generate_new_value(ppat, value_for_negative_test);
> +	if (!value) {
> +		pr_err("fail to generate new value for negative test 2\n");

		Drop 'test 2'.

> +		return -EINVAL;
> +	}
> +
> +	entry = intel_ppat_get(i915, value);
> +	if (!IS_ERR(entry) || PTR_ERR(entry) != -ENOSPC) {
> +		pr_err("fail on negative test\n");
> +		return -EINVAL;
> +	}

Again, differentiate "could not run logic" vs. "failure in logic". Lets
try to propagate the error to the topmost function, and we have
possibility to report SKIP vs FAILURE to I-G-T.

> +	return 0;
> +}
> +
> +static int perform_partial_match_test(struct intel_ppat *ppat)
> +{
> +	struct drm_i915_private *i915 = ppat->i915;
> +	const struct intel_ppat_entry *entry;
> +	u8 value;
> +
> +	value = generate_new_value(ppat, value_for_partial_test);

These will read much better when the filter function has meaningful
name.

> +	if (!value) {
> +		pr_err("fail to generate new value for partial test\n");
> +		return -EINVAL;

This would be a 'could not perform test' instead of test failure.

> +	}
> +
> +	entry = intel_ppat_get(i915, value);
> +	if (IS_ERR(entry)) {
> +		pr_err("fail to get new entry\n");
> +		return PTR_ERR(entry);

This would be too a 'don't have space to execute test error', so here
propagating the error would be good idea.

> +	}
> +
> +	if (!(entry->value != value &&
> +	    GEN8_PPAT_GET_CA(entry->value) == GEN8_PPAT_GET_CA(value))) {

Wrong indent, you're negating both, so need to indent to the inner '('

Instead, negating the whole thing allows dropping the fancy indent and
makes the condition much more understandable:

	if (entry->value == value || .._CA != _CA

'We should not get equal value, and we should not get different _CA'

That's what partial match is about :)

> +		pr_err("value is not expected, expected %x found %x\n",
> +				value, entry->value);
> +		intel_ppat_put(entry);
> +		return -EINVAL;
> +	}
> +
> +	intel_ppat_put(entry);
> +	return 0;
> +}
> +
> +static int igt_ppat_get(void *arg)
> +{
> +	struct drm_i915_private *i915 = arg;
> +	struct intel_ppat *ppat = &i915->ppat;
> +	const struct intel_ppat_entry **entries, **p;
> +	const struct intel_ppat_entry *entry;
> +	unsigned int size = 0;
> +	int i, ret;
> +
> +	if (!ppat->max_entries)
> +		return 0;
> +
> +	ret = igt_ppat_check(i915);
> +	if (ret)
> +		return ret;
> +
> +	/* case 1: alloc new entries */
> +	entries = NULL;
> +	ret = 0;

This ret = 0 is not needed, we already have preceding
'if (ret) return ret;'

Then drop the extra newline, we're looping to change entries, it should
be close to the beginning of loop.

> +
> +	while (!bitmap_full(ppat->used, ppat->max_entries)) {
> +		p = krealloc(entries, (size + 1) *
> +				   sizeof(struct intel_ppat_entry *),
> +				   GFP_KERNEL);

Weird indent, you can even reduce line length with
(size + 1)*sizeof(*entries). 

> +		if (!p) {
> +			ret = -ENOMEM;
> +			goto ppat_put;
> +		}
> +
> +		entries = p;
> +
> +		p = &entries[size++];
> +		*p = NULL;

I think we could allocate at one go, it's literally a couple of
pointers (ppat->max_entries), I think this is bit much for allocating
pointers.

> +
> +		entry = generate_and_check_new_value(ppat);
> +		if (IS_ERR(entry)) {
> +			ret = PTR_ERR(entry);
> +			pr_err("fail on alloc new entries test\n");
> +			goto ppat_put;
> +		}
> +		*p = entry;
> +	}
> +
> +	/* case 2: perfect match */
> +	ret = perform_perfect_match_test(ppat);
> +	if (ret) {
> +		pr_err("fail on perfect match test\n");
> +		return ret;
> +	}
> +
> +	/* case 3: negative test 1, suppose PPAT table is full now */
> +	ret = perform_negative_test(ppat);
> +	if (ret) {
> +		pr_err("fail on negative test 1\n");
> +		goto ppat_put;
> +	}
> +
> +	/* case 4: partial match */
> +	ret = perform_partial_match_test(ppat);
> +	if (ret) {
> +		pr_err("fail on partial match test\n");
> +		goto ppat_put;
> +	}
> +
> +	/* case 3: negative test 2, suppose PPAT table is still full now */
> +	ret = perform_negative_test(ppat);
> +	if (ret) {
> +		pr_err("fail on negative test 2\n");
> +		goto ppat_put;
> +	}
> +
> +	/* case 5: re-alloc test, make a hole and it should work again */
> +	if (entries) {

entries should not be NULL here, we're skipping function krealloc
fails. So causes extra indent.

> +		for (i = 0; i < size; i++) {
> +			entry = entries[i];
> +
> +			ret = put_and_check_entry(entry);
> +			entries[i] = NULL;
> +			if (ret) {
> +				pr_err("fail on re-alloc test\n");
> +				goto ppat_put;
> +			}
> +
> +			entry = generate_and_check_new_value(ppat);
> +			if (IS_ERR(entry)) {
> +				ret = PTR_ERR(entry);
> +				pr_err("fail on re-alloc test\n");
> +				goto ppat_put;
> +			}
> +			entries[i] = entry;
> +		}
> +	}
> +
> +ppat_put:
> +	if (entries) {

This extra indent can go when we we have goto out; at the allocation
error for entries.

> +		for (i = 0; i < size; i++) {
> +			if (IS_ERR(entries[i]) || !entries[i])

IS_ERR_OR_NULL

> +				continue;
> +
> +			if (ret)
> +				intel_ppat_put(entry);
> +			else
> +				ret = put_and_check_entry(entries[i]);
> +		}
> +		kfree(entries);
> +		entries = NULL;
> +	}

out:

> +	if (ret)
> +		return ret;
> +
> +	return igt_ppat_check(i915);
> +}
> +

With the modifications we should be very much ready with the kselftest,
as I see it.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx