On Mon, May 10, 2021 at 5:38 PM Robert O'Callahan <roc@xxxxxxxxx> wrote:
>
> For rr (https://rr-project.org) to support recording and replaying
> applications that use userfaultfd, we need to observe that a task we
> are controlling has blocked on a userfault. Currently this is very
> difficult to do, especially if a task blocks on a userfault on a page
> where some other task has already triggered a userfault, so no new
> userfaultfd event is generated. We also need to observe which page has
> been faulted on so we can determine when the fault has been serviced
> and the task is ready to run again.
>
> I've tried to find workarounds with existing APIs and it doesn't seem
> tractable. See https://github.com/rr-debugger/rr/issues/2852#issuecomment-837514946
> for some thoughts about that.
>
> It seems to me that a sufficient API for us would be a new software
> perf event, e.g. PERF_COUNT_SW_USERFAULTS, with an associated
> PERF_SAMPLE_ADDR that would give us the address of the page. Does that
> sound like a reasonable thing to add?

Is some combination of bpf and kprobes a possible solution? There are
some seemingly relevant examples here:
https://github.com/iovisor/bpftrace/blob/master/docs/tutorial_one_liners.md

I haven't tried it, but it seems like attaching to handle_userfault()
would give similar information to PERF_COUNT_SW_PAGE_FAULTS, but for
userfaults.

>
> Robert O'Callahan
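
For comparison, here is a minimal, untested-in-anger sketch of how the
existing PERF_COUNT_SW_PAGE_FAULTS software event can be opened with
PERF_SAMPLE_ADDR on a single task. The helper names init_fault_attr() and
open_fault_event() are made up for illustration. PERF_COUNT_SW_USERFAULTS
does not exist in current kernels; the assumption is only that, if it were
added, it would be substituted for the config value below.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fill in attr so that every page fault produces one sample carrying the
 * faulting address. PERF_COUNT_SW_PAGE_FAULTS is the existing event; the
 * proposed PERF_COUNT_SW_USERFAULTS is hypothetical and would replace it
 * in attr->config.
 */
void init_fault_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_SOFTWARE;
    attr->config = PERF_COUNT_SW_PAGE_FAULTS;
    attr->sample_period = 1; /* sample on every event */
    attr->sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_ADDR;
}

/*
 * Open the event on one task, on any CPU, with no group leader.
 * Returns a file descriptor, or -1 with errno set. glibc provides no
 * perf_event_open() wrapper, so the raw syscall is used.
 */
int open_fault_event(pid_t pid)
{
    struct perf_event_attr attr;

    init_fault_attr(&attr);
    return (int)syscall(SYS_perf_event_open, &attr, pid, -1, -1, 0UL);
}
```

The samples themselves would then be read from the ring buffer obtained by
mmap()ing the returned fd. Whether the call succeeds for a given task
depends on the caller's privileges and on the perf_event_paranoid setting.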