On Tue, Nov 27, 2018 at 8:38 PM Wang, Dongsheng <dongsheng.wang@xxxxxxxxxxxxxxxx> wrote: > > Hello Kees, > > On 2018/11/28 6:38, Kees Cook wrote: > > On Thu, Nov 22, 2018 at 11:54 PM, Wang Dongsheng > > <dongsheng.wang@xxxxxxxxxxxxxxxx> wrote: > >> When select ARCH_TASK_STRUCT_ON_STACK the first of thread_info variable > >> is overwritten by STACK_END_MAGIC. In fact, the ARCH_TASK_STRUCT_ON_STACK > >> is not a real task on stack, it's only init_task on init_stack. > >> > >> Commit 0500871f21b2 ("Construct init thread stack in the linker script > >> rather than by union") added this macro and put task_strcut into > >> thread_union. This brings us the following possibilities: > >> TASK_ON_STACK THREAD_INFO_IN_TASK STACK > >> ----- <-- thread_info & stack > >> N N | | --- <-- task > >> | | | | > >> ----- --- > >> > >> ----- <-- stack > >> N Y | | --- <-- task(Including thread_info) > >> | | | | > >> ----- --- > >> > >> ----- <-- stack & task & thread_info > >> Y N | | > >> | | > >> ----- > >> > >> ----- <-- stack & task(Including thread_info) > >> Y Y | | > >> | | > >> ----- > >> The kernel has handled the first two cases correctly. > >> > >> For the third case: > >> TASK_ON_STACK: Y. THREAD_INFO_IN_TASK: N. this case > >> should never happen, because the task and thread_info will overlap. So > >> when TASK_ON_STACK is selected, THREAD_INFO_IN_TASK must be selected too. > >> > >> For the fourth case: > >> When task on stack, the end of stack should add a sizeof(task_struct) offset. > >> > >> This patch handled with the third and fourth case. > >> > >> Fixes: 0500871f21b2 ("Construct init thread stack in the linker ...") > >> > >> Signed-off-by: Wang Dongsheng <dongsheng.wang@xxxxxxxxxxxxxxxx> > >> Signed-off-by: Shunyong Yang <shunyong.yang@xxxxxxxxxxxxxxxx> > >> --- > >> arch/Kconfig | 1 + > >> include/linux/sched/task_stack.h | 5 ++++- > >> 2 files changed, 5 insertions(+), 1 deletion(-) > >> > >> diff --git a/arch/Kconfig b/arch/Kconfig > >> index e1e540ffa979..0a2c73e73195 100644 > >> --- a/arch/Kconfig > >> +++ b/arch/Kconfig > >> @@ -251,6 +251,7 @@ config ARCH_HAS_SET_MEMORY > >> # Select if arch init_task must go in the __init_task_data section > >> config ARCH_TASK_STRUCT_ON_STACK > >> bool > >> + depends on THREAD_INFO_IN_TASK || IA64 > > The "IA64" part shouldn't be needed since IA64 already selects it. > > > > Since it's selected, it also can't have a depends, IIUC. > > Since the IA64 thread_info including task_struct, it doesn't need to > select THREAD_INFO_IN_TASK. > So we need to allow IA64 select ARCH_TASK_STRUCT_ON_STACK without > THREAD_INFO. Okay. > >> # Select if arch has its private alloc_task_struct() function > >> config ARCH_TASK_STRUCT_ALLOCATOR > >> diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h > >> index 6a841929073f..624c48defb9e 100644 > >> --- a/include/linux/sched/task_stack.h > >> +++ b/include/linux/sched/task_stack.h > >> @@ -7,6 +7,7 @@ > >> */ > >> > >> #include <linux/sched.h> > >> +#include <linux/sched/task.h> > >> #include <linux/magic.h> > >> > >> #ifdef CONFIG_THREAD_INFO_IN_TASK > >> @@ -25,7 +26,9 @@ static inline void *task_stack_page(const struct task_struct *task) > >> > >> static inline unsigned long *end_of_stack(const struct task_struct *task) > >> { > >> - return task->stack; > >> + if (!IS_ENABLED(CONFIG_ARCH_TASK_STRUCT_ON_STACK) || task != &init_task) > >> + return task->stack; > >> + return (unsigned long *)(task + 1); > >> } > > This seems like a strange place for the change. It feels more like > > init_task has been defined incorrectly. > > The init_task will put into init_stack when ARCH_TASK_STRUCT_ON_STACK is > selected. > include/asm-generic/vmlinux.lds.h: > #define INIT_TASK_DATA(align) \ > . = ALIGN(align); \ > __start_init_task = .; \ > init_thread_union = .; \ > init_stack = .; \ > KEEP(*(.data..init_task)) \ > KEEP(*(.data..init_thread_info)) \ > . = __start_init_task + THREAD_SIZE; \ > __end_init_task = .; > > So we need end_of_stack to offset sizeof(task_struct). Well, I guess I mean I'd rather the end_of_stack() code not be special-cased in the if. The default should be how it was. Perhaps: if (IS_ENABLED(CONFIG_ARCH_TASK_STRUCT_ON_STACK) && task == &init_task) return (unsigned long *)(task + 1); return task->stack; But it still feels strange: why can't task->stack point to the correct place in this special case? -- Kees Cook