On Sat, Oct 09, 2021 at 03:52:02PM +0300, Alexey Dobriyan wrote:
> On Fri, Oct 08, 2021 at 04:55:04PM -0700, Kees Cook wrote:
> > This makes sure that wchan contains a sensible symbol when a process is
> > blocked.
> >
> > Specifically this calls the sleep() syscall, and expects the
> > architecture to have called schedule() from a function that has "sleep"
> > somewhere in its name.
>
> This exposes internal kernel symbol to userspace.

Correct; we're verifying the results of the wchan output, which produces
a kernel symbol for blocked processes.

> Why would want to test that?

This is part of a larger series refactoring/fixing wchan[1], and we've
now tripped over several different failure conditions, so I want to make
sure this doesn't regress in the future.

> Doing s/sleep/SLEEP/g doesn't change kernel but now the test is broken.

Yes; the test would be doing its job, as that would mean there was a
userspace-visible change to wchan, so we'd want to catch it and either
fix the kernel or update the test to reflect the new reality.

> >
> > For example, on the architectures I tested
> > (x86_64, arm64, arm, mips, and powerpc) this is "hrtimer_nanosleep":
> >
> > +/*
> > + * Make sure that wchan returns a reasonable symbol when blocked.
> > + */
>
> Test should be "contains C identifier" then?

Nope, this was intentional. Expanding to a C identifier won't catch the
"we unwound the stack to the wrong depth and now all wchan shows is
'__switch_to'" bug[2]. We're specifically checking that wchan is doing
at least the right thing for the most common blocking state.
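
To make the difference concrete, here is a rough sketch (not part of the
patch; both helper names are made up for illustration) comparing a
"contains C identifier" style check against the substring check the test
uses. A bogus result like "__switch_to" passes the first and fails the
second:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if s is a syntactically valid C identifier. */
static bool is_c_identifier(const char *s)
{
	if (!isalpha((unsigned char)*s) && *s != '_')
		return false;
	for (s++; *s; s++)
		if (!isalnum((unsigned char)*s) && *s != '_')
			return false;
	return true;
}

/* Hypothetical helper mirroring the test's check: "sleep" appears in s. */
static bool wchan_mentions_sleep(const char *s)
{
	return strstr(s, "sleep") != NULL;
}
```

With these, "hrtimer_nanosleep" satisfies both checks, while
"__switch_to" is a perfectly valid identifier that the substring check
still rejects.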

> >
> > +int main(void)
> > +{
> > +	char buf[64];
> > +	pid_t child;
> > +	int sync[2], fd;
> > +
> > +	if (pipe(sync) < 0)
> > +		perror_exit("pipe");
> > +
> > +	child = fork();
> > +	if (child < 0)
> > +		perror_exit("fork");
> > +	if (child == 0) {
> > +		/* Child */
> > +		if (close(sync[0]) < 0)
> > +			perror_exit("child close sync[0]");
> > +		if (close(sync[1]) < 0)
> > +			perror_exit("child close sync[1]");
>
> Redundant close().

Hmm, did you maybe miss the differing array indexes? This closes the
reading end followed by the writing end of the child's pipe.

> >
> > +		sleep(10);
> > +		_exit(0);
> > +	}
> > +	/* Parent */
> > +	if (close(sync[1]) < 0)
> > +		perror_exit("parent close sync[1]");
>
> Redundant close().

It's not, though. This closes the write side of the parent's pipe.

> >
> > +	if (read(sync[0], buf, 1) != 0)
> > +		perror_exit("parent read sync[0]");
>
> Racy if child is scheduled out after first close in the child.

No, the first close will close the child's read-side of the pipe, which
isn't being used. For example, see[3]. The parent's read of
/proc/$child/wchan could technically race if the child is scheduled out
after the second close() and before the sleep(), but the parent is doing
at least 2 syscalls before then. I'm open to a more exact
synchronization method, but this should be sufficient. (e.g. Using
ptrace to catch sleep syscall entry seemed like overkill.)
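
For reference, the synchronization boils down to something like the
following sketch. The helper name spawn_sleeping_child() is hypothetical
and the error handling is simplified compared to the patch; the point is
only that the parent's read() returns 0 (EOF) once every write end of
the pipe, including the child's copy, has been closed:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not in the patch): fork a child that closes both
 * of its pipe ends and then blocks in sleep(). Returns the child's pid
 * once the parent has seen EOF on the pipe, or -1 on error.
 */
static pid_t spawn_sleeping_child(unsigned int seconds)
{
	int sync[2];
	char c;
	ssize_t n;
	pid_t child;

	if (pipe(sync) < 0)
		return -1;

	child = fork();
	if (child < 0) {
		close(sync[0]);
		close(sync[1]);
		return -1;
	}
	if (child == 0) {
		/* Child: drop the unused read end, then the write end. */
		close(sync[0]);
		close(sync[1]);
		sleep(seconds);
		_exit(0);
	}

	/* Parent: close our write end so the child's is the last one open. */
	close(sync[1]);
	/* Blocks until the child closes its write end, then returns 0 (EOF). */
	n = read(sync[0], &c, 1);
	close(sync[0]);
	if (n != 0) {
		kill(child, SIGKILL);
		waitpid(child, NULL, 0);
		return -1;
	}
	return child;
}
```

This still leaves the small window between the child's second close()
and its sleep() mentioned above; the sketch does not attempt to close it.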

-Kees

[1] https://lore.kernel.org/lkml/20211008111527.438276127@xxxxxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20211008124052.GA976@C02TD0UTHF1T.local/
[3] https://man7.org/tlpi/code/online/diff/pipes/pipe_sync.c.html

> >
> > +	snprintf(buf, sizeof(buf), "/proc/%d/wchan", child);
> > +	fd = open(buf, O_RDONLY);
> > +	if (fd < 0) {
> > +		if (errno == ENOENT)
> > +			return 4;
> > +		perror_exit(buf);
> > +	}
> > +
> > +	memset(buf, 0, sizeof(buf));
> > +	if (read(fd, buf, sizeof(buf) - 1) < 1)
> > +		perror_exit(buf);
> > +	if (strstr(buf, "sleep") == NULL) {
> > +		fprintf(stderr, "FAIL: did not find 'sleep' in wchan '%s'\n", buf);
> > +		return 1;
> > +	}
> > +	printf("ok: found 'sleep' in wchan '%s'\n", buf);
> > +
> > +	if (kill(child, SIGKILL) < 0)
> > +		perror_exit("kill");
> > +	if (waitpid(child, NULL, 0) != child) {
> > +		fprintf(stderr, "waitpid: got the wrong child!?\n");
> > +		return 1;
> > +	}
> > +
> > +	return 0;
> > +}

-- 
Kees Cook