On 2/25/23 18:07, Guenter Roeck wrote: > On Mon, Feb 13, 2023 at 12:56:59PM -0600, Frank Rowand wrote: >> Add tests to exercise the actions that occur when the reference count >> of devicetree nodes decrement to zero and beyond. Decrementing to >> zero triggers freeing memory allocated for the node. >> >> This commit will expose a pr_err() issue in of_node_release(), resulting >> in some kernal warnings and stack traces. >> >> When scripts/dtc/of_unittest_expect processes the console messages, >> it will also report related problems for EXPECT messages due to the >> pr_err() issue: >> ** missing EXPECT begin : 5 >> Signed-off-by: Frank Rowand <frowand.list@xxxxxxxxx> >> --- >> >> There are checkpatch warnings. I have reviewed them and feel >> they can be ignored. >> >> Stack traces from the pr_err() issue: >> > > Unfortunately my test beds trigger a warning if they encounter a backtrace > in a log. Trying to implement workarounds would be just too time consuming, > so I had to disable OF_UNITTEST in my boot tests as consequence of this > patch. Thank you for letting me know about this issue. I was a bit concerned about causing stack traces. I do not want to lose your testing! I'll submit a patch to disable any test that causes a backtrace, then add a boot command line option to enable those tests. That was most people will not see the backtrace, the test will be reported as skipped, and people like me can go to the extra effort to enable the tests. Hopefully the patch get a good reception. -Frank > > Guenter > >> ------------[ cut here ]------------ >> WARNING: CPU: 0 PID: 1 at lib/refcount.c:25 kobject_get+0xa0/0xa4 >> refcount_t: addition on 0; use-after-free. >> Modules linked in: >> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G N 6.2.0-rc1-00005-g774057a35a67 #21 >> Hardware name: Generic DT based system >> unwind_backtrace from show_stack+0x10/0x14 >> show_stack from dump_stack_lvl+0x40/0x4c >> dump_stack_lvl from __warn+0x7c/0x15c >> __warn from warn_slowpath_fmt+0x98/0xcc >> warn_slowpath_fmt from kobject_get+0xa0/0xa4 >> kobject_get from of_node_get+0x14/0x1c >> of_node_get from of_fwnode_get+0x34/0x40 >> of_fwnode_get from fwnode_full_name_string+0x34/0xa0 >> fwnode_full_name_string from device_node_string+0x4dc/0x4f4 >> device_node_string from pointer+0x364/0x598 >> pointer from vsnprintf+0x1f8/0x3d0 >> vsnprintf from vprintk_store+0x134/0x410 >> vprintk_store from vprintk_emit+0x6c/0x234 >> vprintk_emit from vprintk_default+0x20/0x28 >> vprintk_default from _printk+0x30/0x60 >> _printk from of_node_release+0xb0/0xf4 >> of_node_release from kobject_put+0xc4/0x29c >> kobject_put from of_unittest+0x256c/0x2eb8 >> of_unittest from do_one_initcall+0x4c/0x268 >> do_one_initcall from kernel_init_freeable+0x1b4/0x214 >> kernel_init_freeable from kernel_init+0x18/0x130 >> kernel_init from ret_from_fork+0x14/0x2c >> Exception stack(0xf0821fb0 to 0xf0821ff8) >> 1fa0: 00000000 00000000 00000000 00000000 >> 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >> 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 >> ---[ end trace 0000000000000000 ]--- >> ------------[ cut here ]------------ >> WARNING: CPU: 0 PID: 1 at lib/refcount.c:28 fwnode_full_name_string+0x8c/0xa0 >> refcount_t: underflow; use-after-free. >> Modules linked in: >> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W N 6.2.0-rc1-00005-g774057a35a67 #21 >> Hardware name: Generic DT based system >> unwind_backtrace from show_stack+0x10/0x14 >> show_stack from dump_stack_lvl+0x40/0x4c >> dump_stack_lvl from __warn+0x7c/0x15c >> __warn from warn_slowpath_fmt+0x98/0xcc >> warn_slowpath_fmt from fwnode_full_name_string+0x8c/0xa0 >> fwnode_full_name_string from device_node_string+0x4dc/0x4f4 >> device_node_string from pointer+0x364/0x598 >> pointer from vsnprintf+0x1f8/0x3d0 >> vsnprintf from vprintk_store+0x134/0x410 >> vprintk_store from vprintk_emit+0x6c/0x234 >> vprintk_emit from vprintk_default+0x20/0x28 >> vprintk_default from _printk+0x30/0x60 >> _printk from of_node_release+0xb0/0xf4 >> of_node_release from kobject_put+0xc4/0x29c >> kobject_put from of_unittest+0x256c/0x2eb8 >> of_unittest from do_one_initcall+0x4c/0x268 >> do_one_initcall from kernel_init_freeable+0x1b4/0x214 >> kernel_init_freeable from kernel_init+0x18/0x130 >> kernel_init from ret_from_fork+0x14/0x2c >> Exception stack(0xf0821fb0 to 0xf0821ff8) >> 1fa0: 00000000 00000000 00000000 00000000 >> 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >> 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 >> ---[ end trace 0000000000000000 ]--- >> ------------[ cut here ]------------ >> WARNING: CPU: 0 PID: 1 at lib/refcount.c:22 kobject_get+0x8c/0xa4 >> refcount_t: saturated; leaking memory. >> Modules linked in: >> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W N 6.2.0-rc1-00005-g774057a35a67 #21 >> Hardware name: Generic DT based system >> unwind_backtrace from show_stack+0x10/0x14 >> show_stack from dump_stack_lvl+0x40/0x4c >> dump_stack_lvl from __warn+0x7c/0x15c >> __warn from warn_slowpath_fmt+0x98/0xcc >> warn_slowpath_fmt from kobject_get+0x8c/0xa4 >> kobject_get from of_node_get+0x14/0x1c >> of_node_get from of_fwnode_get+0x34/0x40 >> of_fwnode_get from fwnode_full_name_string+0x34/0xa0 >> fwnode_full_name_string from device_node_string+0x4dc/0x4f4 >> device_node_string from pointer+0x364/0x598 >> pointer from vsnprintf+0x1f8/0x3d0 >> vsnprintf from vscnprintf+0x10/0x24 >> vscnprintf from printk_sprint+0x18/0x194 >> printk_sprint from vprintk_store+0x378/0x410 >> vprintk_store from vprintk_emit+0x6c/0x234 >> vprintk_emit from vprintk_default+0x20/0x28 >> vprintk_default from _printk+0x30/0x60 >> _printk from of_node_release+0xb0/0xf4 >> of_node_release from kobject_put+0xc4/0x29c >> kobject_put from of_unittest+0x256c/0x2eb8 >> of_unittest from do_one_initcall+0x4c/0x268 >> do_one_initcall from kernel_init_freeable+0x1b4/0x214 >> kernel_init_freeable from kernel_init+0x18/0x130 >> kernel_init from ret_from_fork+0x14/0x2c >> Exception stack(0xf0821fb0 to 0xf0821ff8) >> 1fa0: 00000000 00000000 00000000 00000000 >> 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >> 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 >> ---[ end trace 0000000000000000 ]--- >> >> >> drivers/of/dynamic.c | 14 +- >> .../of/unittest-data/testcases_common.dtsi | 1 + >> drivers/of/unittest-data/tests-lifecycle.dtsi | 8 + >> drivers/of/unittest.c | 148 +++++++++++++++++- >> 4 files changed, 168 insertions(+), 3 deletions(-) >> create mode 100644 drivers/of/unittest-data/tests-lifecycle.dtsi >> >> diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c >> index cd3821a6444f..becb80f762c8 100644 >> --- a/drivers/of/dynamic.c >> +++ b/drivers/of/dynamic.c >> @@ -332,7 +332,19 @@ void of_node_release(struct kobject *kobj) >> /* We should never be releasing nodes that haven't been detached. */ >> if (!of_node_check_flag(node, OF_DETACHED)) { >> pr_err("ERROR: Bad of_node_put() on %pOF\n", node); >> - dump_stack(); >> + >> + /* >> + * of unittests will test this path. Do not print the stack >> + * trace when the error is caused by unittest so that we do >> + * not display what a normal developer might reasonably >> + * consider a real bug. >> + */ >> + if (!IS_ENABLED(CONFIG_OF_UNITTEST) || >> + strcmp(node->parent->full_name, "testcase-data")) { >> + dump_stack(); >> + pr_err("ERROR: next of_node_put() on this node will result in a kboject warning 'refcount_t: underflow; use-after-free.'\n"); >> + } >> + >> return; >> } >> if (!of_node_check_flag(node, OF_DYNAMIC)) >> diff --git a/drivers/of/unittest-data/testcases_common.dtsi b/drivers/of/unittest-data/testcases_common.dtsi >> index 19292bbb4cbb..e7887f2301c1 100644 >> --- a/drivers/of/unittest-data/testcases_common.dtsi >> +++ b/drivers/of/unittest-data/testcases_common.dtsi >> @@ -17,3 +17,4 @@ node-remove { >> #include "tests-address.dtsi" >> #include "tests-platform.dtsi" >> #include "tests-overlay.dtsi" >> +#include "tests-lifecycle.dtsi" >> diff --git a/drivers/of/unittest-data/tests-lifecycle.dtsi b/drivers/of/unittest-data/tests-lifecycle.dtsi >> new file mode 100644 >> index 000000000000..28509a8783a7 >> --- /dev/null >> +++ b/drivers/of/unittest-data/tests-lifecycle.dtsi >> @@ -0,0 +1,8 @@ >> +// SPDX-License-Identifier: GPL-2.0 >> + >> +/ { >> + testcase-data { >> + refcount-node { >> + }; >> + }; >> +}; >> diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c >> index bc0f1e50a4be..db72e208819c 100644 >> --- a/drivers/of/unittest.c >> +++ b/drivers/of/unittest.c >> @@ -54,8 +54,9 @@ static struct unittest_results { >> * Print the expected message only if the current loglevel will allow >> * the actual message to print. >> * >> - * Do not use EXPECT_BEGIN() or EXPECT_END() for messages generated by >> - * pr_debug(). >> + * Do not use EXPECT_BEGIN(), EXPECT_END(), EXPECT_NOT_BEGIN(), or >> + * EXPECT_NOT_END() to report messages expected to be reported or not >> + * reported by pr_debug(). >> */ >> #define EXPECT_BEGIN(level, fmt, ...) \ >> printk(level pr_fmt("EXPECT \\ : ") fmt, ##__VA_ARGS__) >> @@ -63,6 +64,12 @@ static struct unittest_results { >> #define EXPECT_END(level, fmt, ...) \ >> printk(level pr_fmt("EXPECT / : ") fmt, ##__VA_ARGS__) >> >> +#define EXPECT_NOT_BEGIN(level, fmt, ...) \ >> + printk(level pr_fmt("EXPECT_NOT \\ : ") fmt, ##__VA_ARGS__) >> + >> +#define EXPECT_NOT_END(level, fmt, ...) \ >> + printk(level pr_fmt("EXPECT_NOT / : ") fmt, ##__VA_ARGS__) >> + >> static void __init of_unittest_find_node_by_name(void) >> { >> struct device_node *np; >> @@ -1488,6 +1495,7 @@ static int __init unittest_data_add(void) >> struct device_node *next = np->sibling; >> >> np->parent = of_root; >> + /* this will clear OF_DETACHED in np and children */ >> attach_node_and_children(np); >> np = next; >> } >> @@ -2998,6 +3006,141 @@ static void __init of_unittest_overlay(void) >> static inline void __init of_unittest_overlay(void) { } >> #endif >> >> +static void __init of_unittest_lifecycle(void) >> +{ >> + unsigned int refcount; >> + int found_refcount_one = 0; >> + int put_count = 0; >> + struct device_node *np; >> + struct device_node *prev_sibling, *next_sibling; >> + const char *refcount_path = "/testcase-data/refcount-node"; >> + const char *refcount_parent_path = "/testcase-data"; >> + >> + /* >> + * Node lifecycle tests, non-dynamic node: >> + * >> + * - Decrementing refcount to zero via of_node_put() should cause the >> + * attempt to free the node memory by of_node_release() to fail >> + * because the node is not a dynamic node. >> + * >> + * - Decrementing refcount past zero should result in additional >> + * errors reported. >> + */ >> + >> + np = of_find_node_by_path(refcount_path); >> + unittest(np, "find refcount_path \"%s\"\n", refcount_path); >> + if (np == NULL) >> + goto out_skip_tests; >> + >> + while (!found_refcount_one) { >> + >> + if (put_count++ > 10) { >> + unittest(0, "guardrail to avoid infinite loop\n"); >> + goto out_skip_tests; >> + } >> + >> + refcount = kref_read(&np->kobj.kref); >> + if (refcount == 1) >> + found_refcount_one = 1; >> + else >> + of_node_put(np); >> + } >> + >> + EXPECT_BEGIN(KERN_INFO, "OF: ERROR: of_node_release() detected bad of_node_put() on /testcase-data/refcount-node"); >> + >> + /* >> + * refcount is now one, decrementing to zero will result in a call to >> + * of_node_release() to free the node's memory, which should result >> + * in an error >> + */ >> + unittest(1, "/testcase-data/refcount-node is one"); >> + of_node_put(np); >> + >> + EXPECT_END(KERN_INFO, "OF: ERROR: of_node_release() detected bad of_node_put() on /testcase-data/refcount-node"); >> + >> + >> + /* >> + * expect stack trace for subsequent of_node_put(): >> + * __refcount_sub_and_test() calls: >> + * refcount_warn_saturate(r, REFCOUNT_SUB_UAF) >> + * >> + * Not capturing entire WARN_ONCE() trace with EXPECT_*(), just >> + * the first three lines, and the last line. >> + */ >> + EXPECT_BEGIN(KERN_INFO, "------------[ cut here ]------------"); >> + EXPECT_BEGIN(KERN_INFO, "WARNING: <<all>>"); >> + EXPECT_BEGIN(KERN_INFO, "refcount_t: underflow; use-after-free."); >> + EXPECT_BEGIN(KERN_INFO, "---[ end trace <<int>> ]---"); >> + >> + /* refcount is now zero, this should fail */ >> + unittest(1, "/testcase-data/refcount-node is zero"); >> + of_node_put(np); >> + >> + EXPECT_END(KERN_INFO, "---[ end trace <<int>> ]---"); >> + EXPECT_END(KERN_INFO, "refcount_t: underflow; use-after-free."); >> + EXPECT_END(KERN_INFO, "WARNING: <<all>>"); >> + EXPECT_END(KERN_INFO, "------------[ cut here ]------------"); >> + >> + /* >> + * Q. do we expect to get yet another warning? >> + * A. no, the WARNING is from WARN_ONCE() >> + */ >> + EXPECT_NOT_BEGIN(KERN_INFO, "------------[ cut here ]------------"); >> + EXPECT_NOT_BEGIN(KERN_INFO, "WARNING: <<all>>"); >> + EXPECT_NOT_BEGIN(KERN_INFO, "refcount_t: underflow; use-after-free."); >> + EXPECT_NOT_BEGIN(KERN_INFO, "---[ end trace <<int>> ]---"); >> + >> + unittest(1, "/testcase-data/refcount-node is zero, second time"); >> + of_node_put(np); >> + >> + EXPECT_NOT_END(KERN_INFO, "---[ end trace <<int>> ]---"); >> + EXPECT_NOT_END(KERN_INFO, "refcount_t: underflow; use-after-free."); >> + EXPECT_NOT_END(KERN_INFO, "WARNING: <<all>>"); >> + EXPECT_NOT_END(KERN_INFO, "------------[ cut here ]------------"); >> + >> + /* >> + * refcount of zero will trigger stack traces from any further >> + * attempt to of_node_get() node "refcount-node". One example of >> + * this is where of_unittest_check_node_linkage() will recursively >> + * scan the tree, with 'for_each_child_of_node()' doing an >> + * of_node_get() of the children of a node. >> + * >> + * Prevent the stack trace by removing node "refcount-node" from >> + * its parent's child list. >> + * >> + * WARNING: EVIL, EVIL, EVIL: >> + * >> + * Directly manipulate the child list of node /testcase-data to >> + * remove child refcount-node. This is ignoring all proper methods >> + * of removing a child and will leak a small amount of memory. >> + */ >> + >> + np = of_find_node_by_path(refcount_parent_path); >> + unittest(np, "find refcount_parent_path \"%s\"\n", refcount_parent_path); >> + unittest(np, "ERROR: devicetree live tree left in a 'bad state' if test fail\n"); >> + if (np == NULL) >> + return; >> + >> + prev_sibling = np->child; >> + next_sibling = prev_sibling->sibling; >> + if (!strcmp(prev_sibling->full_name, "refcount-node")) { >> + np->child = next_sibling; >> + next_sibling = next_sibling->sibling; >> + } >> + while (next_sibling) { >> + if (!strcmp(next_sibling->full_name, "refcount-node")) >> + prev_sibling->sibling = next_sibling->sibling; >> + prev_sibling = next_sibling; >> + next_sibling = next_sibling->sibling; >> + } >> + of_node_put(np); >> + >> + return; >> + >> +out_skip_tests: >> + unittest(0, "One or more lifecycle tests skipped\n"); >> +} >> + >> #ifdef CONFIG_OF_OVERLAY >> >> /* >> @@ -3502,6 +3645,7 @@ static int __init of_unittest(void) >> of_unittest_match_node(); >> of_unittest_platform_populate(); >> of_unittest_overlay(); >> + of_unittest_lifecycle(); >> >> /* Double check linkage after removing testcase data */ >> of_unittest_check_tree_linkage(); >> -- >> Frank Rowand <frowand.list@xxxxxxxxx> >>