Re: [PATCH v3 2/2] lkdtm: Add Shadow Call Stack tests

Dan Li <ashimida@xxxxxxxxxxxxxxxxx> · Fri, 4 Mar 2022 06:34:30 -0800

On 3/3/22 10:42, Kees Cook wrote:
On Wed, Mar 02, 2022 at 11:43:39PM -0800, Dan Li wrote:
Add tests for SCS (Shadow Call Stack) based
backward CFI (as implemented by Clang and GCC).

Cool; thanks for writing these!

+lkdtm-$(CONFIG_LKDTM)		+= scs.o

I'd expect these to be in cfi.c, rather than making a new source file.

Got it.

+static noinline void lkdtm_scs_clear_lr(void)
+{
+	unsigned long *lr = (unsigned long *)__builtin_frame_address(0) + 1;
+
+	asm volatile("str xzr, [%0]\n\t" : : "r"(lr) : "x30");

Is the asm needed here? Why not:

	unsigned long *lr = (unsigned long *)__builtin_frame_address(0) + 1;

	*lr = 0;

Yeah, with "volatile", this one looks better.

+
+/*
+ * This tries to call a function protected by Shadow Call Stack,
+ * which corrupts its own return address during execution.
+ * Due to the protection, the corruption will not take effect
+ * when the function returns.
+ */
+void lkdtm_CFI_BACKWARD_SHADOW(void)

I think these two tests should be collapsed into a single one.

It seems that there is currently no cross-line matching in
selftests/lkdtm/run.sh, if we put these two into one function and
assume we could make noscs_set_lr _survivable_ (like in your example).

Then we could only match "CFI_BACKWARD_SHADOW ok: scs takes effect."
in texts.txt

But if the test result is:
XPASS: Unexpectedly survived lr corruption without scs?
ok: scs takes effect.

It may not be a real pass, but the xxx_set_lr function doesn't work.

+{
+#ifdef CONFIG_ARM64
+	if (!IS_ENABLED(CONFIG_SHADOW_CALL_STACK)) {
+		pr_err("FAIL: kernel not built with CONFIG_SHADOW_CALL_STACK\n");
+		return;
+	}
+
+	pr_info("Trying to corrupt lr in a function with scs protection ...\n");
+	lkdtm_scs_clear_lr();
+
+	pr_err("ok: scs takes effect.\n");
+#else
+	pr_err("XFAIL: this test is arm64-only\n");
+#endif

This is slightly surprising -- we have no detection when a function has
its non-shadow-stack return address corrupted: it just _ignores_ the
value stored there. That seems like a missed opportunity for warning
about an unexpected state.

Yes.
Actually I used to try in the plugin to add a detection before the function
returns, and call a callback when a mismatch is found. But since almost
every function has to be instrumented, the performance penalty is
improved from <3% to ~20% (rough calculation, should still be optimized).

+}
+
+/*
+ * This tries to call a function not protected by Shadow Call Stack,
+ * which corrupts its own return address during execution.
+ */
+void lkdtm_CFI_BACKWARD_SHADOW_WITH_NOSCS(void)
+{
+#ifdef CONFIG_ARM64
+	if (!IS_ENABLED(CONFIG_SHADOW_CALL_STACK)) {
+		pr_err("FAIL: kernel not built with CONFIG_SHADOW_CALL_STACK\n");
+		return;

Other tests try to give some hints about failures, e.g.:

		pr_err("FAIL: cannot change for SCS\n");
		pr_expected_config(CONFIG_SHADOW_CALL_STACK);

Though, having the IS_ENABLED in there makes me wonder if this test
should instead be made _survivable_ on failure. Something like this,
completely untested:

#ifdef CONFIG_ARM64
static noinline void lkdtm_scs_set_lr(unsigned long *addr)
{
	unsigned long **lr = (unsigned long **)__builtin_frame_address(0) + 1;
	*lr = addr;
}

/* Function with __noscs attribute clears its return address. */
static noinline void __noscs lkdtm_noscs_set_lr(unsigned long *addr)
{
	unsigned long **lr = (unsigned long **)__builtin_frame_address(0) + 1;
	*lr = addr;
}
#endif

void lkdtm_CFI_BACKWARD_SHADOW(void)
{
#ifdef CONFIG_ARM64

	/* Verify the "normal" condition of LR corruption working. */
	do {
		/* Keep label in scope to avoid compiler warning. */
		if ((volatile int)0)
			goto unexpected;

		pr_info("Trying to corrupt lr in a function without scs protection ...\n");
		lkdtm_noscs_set_lr(&&expected);

unexpected:
		pr_err("XPASS: Unexpectedly survived lr corruption without scs?!\n");
		break;

expected:
		pr_err("ok: lr corruption redirected without scs.\n");
	} while (0);

	do {
		/* Keep labe in scope to avoid compiler warning. */
		if ((volatile int)0)
			goto good_scs;

		pr_info("Trying to corrupt lr in a function with scs protection ...\n");
		lkdtm_scs_set_lr(&&bad_scs);

good_scs:
		pr_info("ok: scs takes effect.\n");
		break;

bad_scs:
		pr_err("FAIL: return address rewritten!\n");
		pr_expected_config(CONFIG_SHADOW_CALL_STACK);
	} while (0);
#else
	pr_err("XFAIL: this test is arm64-only\n");
#endif
}

Thanks for the example, Kees :)
This code (with a little modification) works correctly with clang 12,
but to make sure it's always correct, I think we might need to add the
__attribute__((optnone)) attribute to it, because under -O2 the result
doesn't seem to be "very stable" (as in your example in the next email).

And we should, actually, be able to make the "set_lr" functions be
arch-specific, leaving the test itself arch-agnostic....

I'm not sure if my understanding is correct, do it means we should
remove the "#ifdef CONFIG_ARM64" in lkdtm_CFI_BACKWARD_SHADOW?

Then we may not be able to distinguish between failures caused by
platform unsupported (XFAIL) and features not enabled (or not
working properly).

Thanks,
Dan.