Re: Userspace notifications for observing userfaultfd faults

Axel Rasmussen <axelrasmussen@xxxxxxxxxx> · Tue, 11 May 2021 11:11:58 -0700

On Mon, May 10, 2021 at 5:38 PM Robert O'Callahan <roc@xxxxxxxxx> wrote:
>
> For rr (https://rr-project.org) to support recording and replaying
> applications that use userfaultfd, we need to observe that a task we
> are controlling has blocked on a userfault. Currently this is very
> difficult to do, especially if a task blocks on a userfault on a page
> where some other task has already triggered a userfault, so no new
> userfaultfd event is generated. We also need to observe which page has
> been faulted on so we can determine when the fault has been serviced
> and the task is ready to run again.
>
> I've tried to find workarounds with existing APIs and it doesn't seem
> tractable. See https://github.com/rr-debugger/rr/issues/2852#issuecomment-837514946
> for some thoughts about that.
>
> It seems to me that a sufficient API for us would be a new software
> perf event, e.g. PERF_COUNT_SW_USERFAULTS, with an associated
> PERF_SAMPLE_ADDR that would give us the address of the page. Does that
> sound like a reasonable thing to add?

Is some combination of bpf and kprobes a possible solution? There are
some seemingly relevant examples here:
https://github.com/iovisor/bpftrace/blob/master/docs/tutorial_one_liners.md

I haven't tried it, but attaching a kprobe to handle_userfault() looks
like it would give information similar to PERF_COUNT_SW_PAGE_FAULTS,
only for userfaults.
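
To make that concrete, here is a rough, untested sketch of what I have
in mind. It is a small Python wrapper around a bpftrace one-liner. The
assumptions: bpftrace is installed and run as root, handle_userfault()
is kprobe-able on the running kernel, its first argument is a
struct vm_fault * whose "address" field holds the faulting address, and
the kernel exposes type information (e.g. BTF) so bpftrace can resolve
that struct. The names BPFTRACE_PROGRAM, build_command and run are just
illustrative.

```python
import shutil
import subprocess

# bpftrace program (assumption: handle_userfault(struct vm_fault *vmf, ...)
# is visible to kprobes, and struct vm_fault can be resolved by bpftrace).
# Prints the process id, thread id, command name and faulting address each
# time a task enters the userfault handler.
BPFTRACE_PROGRAM = r'''
kprobe:handle_userfault
{
    printf("%d %d %s 0x%lx\n", pid, tid, comm,
           ((struct vm_fault *)arg0)->address);
}
'''


def build_command(program=BPFTRACE_PROGRAM):
    """Return the argv list that runs the given bpftrace program."""
    return ["bpftrace", "-e", program.strip()]


def run():
    """Run the probe until interrupted; needs root and bpftrace in PATH."""
    if shutil.which("bpftrace") is None:
        raise RuntimeError("bpftrace not found in PATH")
    return subprocess.call(build_command())
```

Calling run() as root would stream one line per handle_userfault()
entry. Since the probe sits on the handler itself rather than on the
userfaultfd event queue, it would presumably also report a task that
faults on a page another task is already waiting on, though that needs
checking.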

>
> Robert O'Callahan