On Fri, Aug 20, 2021, Mathieu Desnoyers wrote:
> Without the lazy clear scheme, a rseq c.s. would look like:
>
>  *     init(rseq_cs)
>  *     cpu = TLS->rseq::cpu_id_start
>  *   [1]   TLS->rseq::rseq_cs = rseq_cs
>  *   [start_ip]            ----------------------------
>  *   [2]   if (cpu != TLS->rseq::cpu_id)
>  *             goto abort_ip;
>  *   [3]   <last_instruction_in_cs>
>  *   [post_commit_ip]      ----------------------------
>  *   [4]   TLS->rseq::rseq_cs = NULL
>
> But as a fast-path optimization, [4] is not entirely needed because the rseq_cs
> descriptor contains information about the instruction pointer range of the critical
> section. Therefore, userspace can omit [4], but if the kernel never clears it, it
> means that it will have to re-read the rseq_cs descriptor's content each time it
> needs to check it to confirm that it is not nested over a rseq c.s..
>
> So making the kernel lazily clear the rseq_cs pointer is just an optimization which
> ensures that the kernel won't do useless work the next time it needs to check
> rseq_cs, given that it has already validated that the userspace code is currently
> not within the rseq c.s. currently advertised by the rseq_cs field.

Thanks for the explanation, much appreciated!
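
For anyone reading this in the archive, here is how I would model the
lazy-clear idea in plain userspace C. This is only a sketch under my own
assumptions, not the kernel's code: struct model_rseq_cs, struct model_rseq,
model_ip_in_cs() and the descriptor_reads counter are hypothetical names made
up for illustration, and the "descriptor read" is just a counter bump standing
in for the cost of re-reading the descriptor from userspace memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the rseq_cs descriptor: the IP range of one c.s. */
struct model_rseq_cs {
	uintptr_t start_ip;        /* first instruction of the c.s. */
	uintptr_t post_commit_ip;  /* first instruction after the c.s. */
};

/* Hypothetical stand-in for the per-thread TLS rseq area. */
struct model_rseq {
	const struct model_rseq_cs *rseq_cs;  /* NULL: no c.s. advertised */
};

/* Counts how often the model had to read a descriptor's contents. */
static unsigned long descriptor_reads;

/*
 * Model of the check "is this thread currently nested over a rseq c.s.?".
 * Returns true if ip lies inside the advertised range. If ip is outside
 * the range and lazy_clear is set, the pointer is cleared so that the
 * next check can bail out without touching the descriptor again.
 */
bool model_ip_in_cs(struct model_rseq *t, uintptr_t ip, bool lazy_clear)
{
	const struct model_rseq_cs *cs;

	assert(t != NULL);
	cs = t->rseq_cs;
	if (cs == NULL)
		return false;  /* fast path: nothing advertised, nothing to read */

	descriptor_reads++;  /* stands in for re-reading the descriptor */
	if (ip >= cs->start_ip && ip < cs->post_commit_ip)
		return true;

	if (lazy_clear)
		t->rseq_cs = NULL;  /* already validated: not in this c.s. */
	return false;
}
```

With userspace omitting [4], two back-to-back checks at an ip outside the
range cost two descriptor reads when lazy_clear is false, and only one when
it is true, since the second check sees NULL and returns early.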