On 07/05, Vladimir Divjak wrote: > > * Problem description / Rationale: > In automotive and/or embedded environments, > the storage capacity to store, and/or > network capabilities to upload > a complete core file can easily be a limiting factor, > making offline issue analysis difficult. To be honest, I don't like the idea... plus the implementation looks horrible to me, sorry. Can't the coredump helper process simply do ptrace(PTRACE_SEIZE, PTRACE_O_TRACEEXIT), close the pipe, and wait for PTRACE_EVENT_EXIT ? Then it can use ptrace() as usual. > +void cdh_unlink_current(void) > +{ > + struct cdh_entry *entry, *next; > + > + mutex_lock(&cdh_mutex); > + list_for_each_entry_safe(entry, next, &cdh_list, cdh_list_link) { Why _safe ? > +bool cdh_ptrace_allowed(struct task_struct *task) > +{ > + struct cdh_entry *entry; > + > + mutex_lock(&cdh_mutex); > + list_for_each_entry(entry, &cdh_list, cdh_list_link) { > + if (task_tgid_nr(entry->task_being_dumped) == task_tgid_nr(task) > + && entry->helper_pid == task_tgid_nr(current)) { > + reinit_completion(&(entry->ptrace_done)); > + wait_task_inactive(entry->task_being_dumped, 0); So. IIUC, this assumes that when cdh_ptrace_allowed() returns the dumping process must be blocked in dump_emit()->wait_for_completion(ptrace_done). And thus ptrace_attach() can safely do task->state = TASK_TRACED. But it is possible that __dump_emit() has already failed and task_being_dumped sleeps in cdh_unlink_current() waiting for cdh_mutex. So it will be running right after cdh_ptrace_allowed() drops cdh_mutex. > +struct cdh_entry *cdh_get_entry_for_current(void) > +{ > + struct cdh_entry *entry; > + > + list_for_each_entry(entry, &cdh_list, cdh_list_link) { > + if (entry->task_being_dumped == current) > + return entry; Why is it safe without cdh_mutex ? > @@ -361,6 +362,8 @@ static int ptrace_attach(struct task_struct *task, long request, > { > bool seize = (request == PTRACE_SEIZE); > int retval; > + bool core_state = false; > + bool core_trace_allowed = false; > > retval = -EIO; > if (seize) { > @@ -392,10 +395,17 @@ static int ptrace_attach(struct task_struct *task, long request, > > task_lock(task); > retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS); > + if (unlikely(task->mm->core_state)) > + core_state = true; task->mm can be NULL > + if (!seize && unlikely(core_state)) { > + if (cdh_ptrace_allowed(task)) > + core_trace_allowed = true; > + } Why !seize ??? What if ptrace_attach() fails after that? Who will wake this task up ? > + /* > + * Core state process does not process signals normally. > + * set directly to TASK_TRACED if allowed by cdh_ptrace_allowed. > + */ > + if (core_trace_allowed) > + task->state = TASK_TRACED; See above. But even if I missed something, this is wrong no matter what, you should never change another task's state. > @@ -821,6 +838,8 @@ static int ptrace_resume(struct task_struct *child, long request, > { > bool need_siglock; > > + cdh_signal_continue(child); takes cdh_mutex :/ Oleg.