On Fri, Jul 12, 2024, Steven Rostedt wrote:
> On Fri, 12 Jul 2024 11:32:30 -0400
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
> > >>> I was looking at the rseq on request from the KVM call, however it does not
> > >>> make sense to me yet how to expose the rseq area via the Guest VA to the host
> > >>> kernel. rseq is for userspace to kernel, not VM to kernel.
> >
> > Any memory that is exposed to host userspace can be exposed to the guest. Things
> > like this are implemented via "overlay" pages, where the guest asks host userspace
> > to map the magic page (rseq in this case) at GPA 'x'. Userspace then creates a
> > memslot that overlays guest RAM to map GPA 'x' to host VA 'y', where 'y' is the
> > address of the page containing the rseq structure associated with the vCPU (in
> > pretty much every modern VMM, each vCPU has a dedicated task/thread).
> >
> > At that point, the vCPU can read/write the rseq structure directly.
>
> So basically, the vCPU thread can just create a virtio device that
> exposes the rseq memory to the guest kernel?
>
> One other issue we need to worry about is that IIUC rseq memory is
> allocated by the guest/user, not the host kernel. This means it can be
> swapped out. The code that handles this needs to be able to handle user
> page faults.

This is a non-issue; it will Just Work, same as any other memory that is
exposed to the guest and can be reclaimed/swapped/migrated. If the host
swaps out the rseq page, mmu_notifiers will call into KVM and KVM will
unmap the page from the guest. If/when the page is accessed by the guest,
KVM will fault the page back into the host's primary MMU, and then map the
new pfn into the guest.