Re: [PATCH] coredump: allow PTRACE_ATTACH to coredump user mode helper

Oleg Nesterov <oleg@xxxxxxxxxx> · Thu, 8 Jul 2021 14:02:14 +0200

On 07/05, Vladimir Divjak wrote:
>
> * Problem description / Rationale:
> In automotive and/or embedded environments,
> the storage capacity to store, and/or
> network capabilities to upload
> a complete core file can easily be a limiting factor,
> making offline issue analysis difficult.

To be honest, I don't like the idea... plus the implementation looks
horrible to me, sorry.

Can't the coredump helper process simply do ptrace(PTRACE_SEIZE) with
PTRACE_O_TRACEEXIT, close the pipe, and wait for PTRACE_EVENT_EXIT?
Then it can use ptrace() as usual.
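
A minimal user-space sketch of that sequence, for illustration only. The
function name and signature are hypothetical, and whether this covers
everything the patch is trying to achieve is exactly the open question.
It seizes the target with PTRACE_O_TRACEEXIT, closes the pipe, and then
waits until the tracee reports PTRACE_EVENT_EXIT:

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper-side routine (illustrative name and signature).
 *
 * Seize `pid` with PTRACE_O_TRACEEXIT, close `pipe_fd` (the helper's end
 * of the pipe it shares with `pid`), then wait until the tracee stops in
 * PTRACE_EVENT_EXIT.
 *
 * Returns 0 with the tracee stopped at exit, -1 on error.
 */
int seize_close_and_wait_for_exit(pid_t pid, int pipe_fd)
{
	int status;

	/* PTRACE_SEIZE attaches without stopping the tracee. */
	if (ptrace(PTRACE_SEIZE, pid, NULL,
		   (void *)(long)PTRACE_O_TRACEEXIT) == -1)
		return -1;

	/* Closing our end makes the peer's pipe I/O finish or fail. */
	close(pipe_fd);

	for (;;) {
		int sig;

		if (waitpid(pid, &status, __WALL) == -1)
			return -1;

		/* The tracee died without reporting an exit stop. */
		if (!WIFSTOPPED(status))
			return -1;

		/* Exit stop: the tracee is ours to inspect with ptrace(). */
		if ((status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXIT << 8)))
			return 0;

		/*
		 * Any other stop: resume. Re-inject the signal for a
		 * signal-delivery stop, inject nothing for event stops.
		 * (A sketch; a real tracer would handle group-stop with
		 * PTRACE_LISTEN.)
		 */
		sig = (status >> 16) ? 0 : WSTOPSIG(status);
		if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) == -1)
			return -1;
	}
}
```

In a core_pattern pipe helper the core arrives on stdin, so the call
would presumably look like seize_close_and_wait_for_exit(pid, STDIN_FILENO),
with pid taken from the %P specifier (an assumption about how the helper
is invoked). Once it returns 0 the helper can use PTRACE_GETREGSET,
PTRACE_PEEKDATA and friends, and finally PTRACE_DETACH to let the exit
proceed.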

> +void cdh_unlink_current(void)
> +{
> +	struct cdh_entry *entry, *next;
> +
> +	mutex_lock(&cdh_mutex);
> +	list_for_each_entry_safe(entry, next, &cdh_list, cdh_list_link) {

Why _safe ?

> +bool cdh_ptrace_allowed(struct task_struct *task)
> +{
> +	struct cdh_entry *entry;
> +
> +	mutex_lock(&cdh_mutex);
> +	list_for_each_entry(entry, &cdh_list, cdh_list_link) {
> +		if (task_tgid_nr(entry->task_being_dumped) == task_tgid_nr(task)
> +		    && entry->helper_pid == task_tgid_nr(current)) {
> +			reinit_completion(&(entry->ptrace_done));
> +			wait_task_inactive(entry->task_being_dumped, 0);

So, IIUC, this assumes that by the time cdh_ptrace_allowed() returns, the
dumping process is blocked in dump_emit()->wait_for_completion(ptrace_done),
and thus ptrace_attach() can safely do task->state = TASK_TRACED.

But it is possible that __dump_emit() has already failed and task_being_dumped
is sleeping in cdh_unlink_current(), waiting for cdh_mutex. In that case it
will be running right after cdh_ptrace_allowed() drops cdh_mutex.

> +struct cdh_entry *cdh_get_entry_for_current(void)
> +{
> +	struct cdh_entry *entry;
> +
> +	list_for_each_entry(entry, &cdh_list, cdh_list_link) {
> +		if (entry->task_being_dumped == current)
> +			return entry;

Why is this list walk safe without cdh_mutex?

> @@ -361,6 +362,8 @@ static int ptrace_attach(struct task_struct *task, long request,
>  {
>  	bool seize = (request == PTRACE_SEIZE);
>  	int retval;
> +	bool core_state = false;
> +	bool core_trace_allowed = false;
>
>  	retval = -EIO;
>  	if (seize) {
> @@ -392,10 +395,17 @@ static int ptrace_attach(struct task_struct *task, long request,
>
>  	task_lock(task);
>  	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
> +	if (unlikely(task->mm->core_state))
> +		core_state = true;

task->mm can be NULL here (e.g. the task has already passed exit_mm()), so
this dereference can oops.

> +	if (!seize && unlikely(core_state)) {
> +		if (cdh_ptrace_allowed(task))
> +			core_trace_allowed = true;
> +	}

Why !seize ???

What if ptrace_attach() fails after that? Who will wake this task up ?

> +	/*
> +	 * Core state process does not process signals normally.
> +	 * set directly to TASK_TRACED if allowed by cdh_ptrace_allowed.
> +	 */
> +	if (core_trace_allowed)
> +		task->state = TASK_TRACED;

See above.

But even if I missed something, this is wrong no matter what: you should
never change another task's state.

> @@ -821,6 +838,8 @@ static int ptrace_resume(struct task_struct *child, long request,
>  {
>  	bool need_siglock;
>
> +	cdh_signal_continue(child);

This takes cdh_mutex on every ptrace_resume() :/

Oleg.