Read the "real_parent" field of task_struct

John Wood <john.wood@xxxxxxx> · Fri, 25 Sep 2020 18:11:42 +0200

Hi,

I'm working in a LSM that uses the task_alloc hook to do some work. This
hook needs to check the "real_parent" field hold by the task_struct
structure. I'm very confused since navigating the source code I see many
different ways to access this field. I don't understand why every method
is used in each case. So I don't know how to implement this access in my
LSM in a secure way.

The real_parent field is defined in the task_struct structure as:

struct task_struct __rcu	*real_parent;

So, as far I can understand this pointer uses the "Read Copy Update"
feature.

Below I show some examples of different access:

--------------------------------------------------------------------------
Example 1
--------------------------------------------------------------------------

void proc_fork_connector(struct task_struct *task)
{
	[...]
	rcu_read_lock();
	parent = rcu_dereference(task->real_parent);
	ev->event_data.fork.parent_pid = parent->pid;
	ev->event_data.fork.parent_tgid = parent->tgid;
	rcu_read_unlock();
	[...]
}

Here to access the real_parent field the code uses rcu_dereference inside
the rcu_read_lock/rcu_read_unlock block.

--------------------------------------------------------------------------
Example 2
--------------------------------------------------------------------------

static struct pid *
get_children_pid(struct inode *inode, struct pid *pid_prev, loff_t pos)
{
	[...]
	read_lock(&tasklist_lock);
	[...]
		if (task && task->real_parent == start &&
		    !(list_empty(&task->sibling))) {
	[...]
	read_unlock(&tasklist_lock);
	[...]
}

Here to access the real_parent field the code reads the pointer directly
inside the read_lock/read_unlock block. No rcu block needed? Why is not
used the rcu_dereference function?

--------------------------------------------------------------------------
Example 3
--------------------------------------------------------------------------

struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
	[...]
	.real_parent	= &init_task,
	.parent		= &init_task,
	[...]
	RCU_POINTER_INITIALIZER(real_cred, &init_cred),
	RCU_POINTER_INITIALIZER(cred, &init_cred),
	[...]
};

Here the initialization is directly. If the pointer is declared __rcu, the
assigment to the real_parent should be with RCU_POINTER_INITIALIZER macro?

--------------------------------------------------------------------------
Example 4
--------------------------------------------------------------------------

static void forget_original_parent(struct task_struct *father,
					struct list_head *dead)
{
	[...]
			RCU_INIT_POINTER(t->real_parent, reaper);
	[...]
}

Here the initialization uses the RCU_INIT_POINTER macro. It's not directly.

--------------------------------------------------------------------------
Example 5
--------------------------------------------------------------------------

SYSCALL_DEFINE2(setpgid, pid_t, pid, pid_t, pgid)
{
	[...]
	rcu_read_lock();

	/* From this point forward we keep holding onto the tasklist lock
	 * so that our parent does not change from under us. -DaveM
	 */
	write_lock_irq(&tasklist_lock);
	[...]

	if (same_thread_group(p->real_parent, group_leader)) {

	[...]
	write_unlock_irq(&tasklist_lock);
	rcu_read_unlock();
	[...]
}

Here to access the real_parent field the code reads the pointer directly
inside the write_lock_irq/write_unlock_irq block nested in a rcu_read_lock/
rcu_read_unlock block.

--------------------------------------------------------------------------
Example 6
--------------------------------------------------------------------------

long keyctl_session_to_parent(void)
{
	[...]
	rcu_read_lock();
	write_lock_irq(&tasklist_lock);

	[...]
	parent = rcu_dereference_protected(me->real_parent,
					   lockdep_is_held(&tasklist_lock));

	[...]
	write_unlock_irq(&tasklist_lock);
	rcu_read_unlock();
	[...]
}

But here to access the real_parent field the code uses rcu_dereference_*
inside the write_lock_irq/write_unlock_irq block nested in a rcu_read_lock/
rcu_read_unlock block. The nested blocks are the sames that in the example 5
but the access is not directly, it uses rcu_dereference_*. Why?

Extracted from the documentation:

[1] The variant rcu_dereference_protected() can be used outside of an RCU
read-side critical section as long as the usage is protected by locks
acquired by the update-side code. This variant avoids the lockdep warning
that would happen when using (for example) rcu_dereference() without
rcu_read_lock() protection. Using rcu_dereference_protected() also has
the advantage of permitting compiler optimizations that rcu_dereference()
must prohibit. The rcu_dereference_protected() variant takes a lockdep
expression to indicate which locks must be acquired by the caller. If the
indicated protection is not provided, a lockdep splat is emitted.

Moreover, why the rcu_dereference_protected is used under a rcu block?

There are more examples but are similars to the ones showed. So my question
is how to read the "real_parent" field correctly. If I can understand all
the above examples I think I will have the knowledge to implement my LSM in
a correct way.

Any help that points me to the right direction will be greatly apreciated.
Some rules to know when and why use a method or another are welcome.

Thanks in advance. Regards,
John Wood

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies