On Tue, 26 Jan 2021 19:54:12 +0100 Piotr Figiel <figiel@xxxxxxxxxx> wrote:

> For userspace checkpoint and restore (C/R) some way of getting process
> state containing RSEQ configuration is needed.
>
> There are two ways this information is going to be used:
>  - to re-enable RSEQ for threads which had it enabled before C/R
>  - to detect if a thread was in a critical section during C/R
>
> Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
> using the address registered before C/R.
>
> Detection whether the thread is in a critical section during C/R is
> needed to enforce behavior of RSEQ abort during C/R. Attaching with
> ptrace() before registers are dumped itself doesn't cause RSEQ abort.
> Restoring the instruction pointer within the critical section is
> problematic because rseq_cs may get cleared before the control is
> passed to the migrated application code leading to RSEQ invariants not
> being preserved.
>
> To achieve above goals expose the RSEQ structure address and the
> signature value with the new per-thread procfs file "rseq".

Using "/proc/<pid>/rseq" would be more informative.

>  fs/exec.c      |  2 ++
>  fs/proc/base.c | 22 ++++++++++++++++++++++
>  kernel/rseq.c  |  4 ++++

A Documentation/ update would be appropriate.

>  3 files changed, 28 insertions(+)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 5d4d52039105..5d84f98847f1 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1830,7 +1830,9 @@ static int bprm_execve(struct linux_binprm *bprm,
>  	/* execve succeeded */
>  	current->fs->in_exec = 0;
>  	current->in_execve = 0;
> +	task_lock(current);
>  	rseq_execve(current);
> +	task_unlock(current);

There's a comment over the task_lock() implementation which explains
what things it locks.  An update to that would be helpful.

> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -662,6 +662,22 @@ static int proc_pid_syscall(struct seq_file *m, struct pid_namespace *ns,
>
>  	return 0;
>  }
> +
> +#ifdef CONFIG_RSEQ
> +static int proc_pid_rseq(struct seq_file *m, struct pid_namespace *ns,
> +				struct pid *pid, struct task_struct *task)
> +{
> +	int res = lock_trace(task);
> +
> +	if (res)
> +		return res;
> +	task_lock(task);
> +	seq_printf(m, "%px %08x\n", task->rseq, task->rseq_sig);
> +	task_unlock(task);
> +	unlock_trace(task);
> +	return 0;
> +}

Do we actually need task_lock() for this purpose?  Would
exec_update_lock() alone be adequate and appropriate?
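
For reference, a C/R tool might consume one line of the proposed file
roughly as follows.  This is only a userspace sketch, assuming the
"%px %08x\n" output format from the seq_printf() call quoted above;
struct rseq_proc_info and parse_rseq_line() are hypothetical names for
illustration and are not part of the patch or of any existing API.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the two fields; not a kernel type. */
struct rseq_proc_info {
	uint64_t rseq_addr;	/* userspace address of the registered struct rseq */
	uint32_t signature;	/* signature value passed at registration */
};

/*
 * Parse one line in the "<hex address> <hex signature>" format that the
 * patch proposes.  Returns 0 on success, -1 on NULL or malformed input.
 * Assumption: an all-zero address means no rseq area is registered.
 */
static int parse_rseq_line(const char *line, struct rseq_proc_info *out)
{
	uint64_t addr;
	uint32_t sig;

	if (!line || !out)
		return -1;
	/* Both fields are hexadecimal and carry no "0x" prefix. */
	if (sscanf(line, "%" SCNx64 " %" SCNx32, &addr, &sig) != 2)
		return -1;

	out->rseq_addr = addr;
	out->signature = sig;
	return 0;
}
```

A caller would read the line with fgets() from the per-thread file and
treat a zero rseq_addr as "RSEQ not enabled for this thread" before
deciding whether to re-register after restore.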