Sorry, I missed the sched_ext folks; adding them as well.

Thanks,
Vineeth

On Wed, Apr 3, 2024 at 10:01 AM Vineeth Pillai (Google)
<vineeth@xxxxxxxxxxxxxxx> wrote:
>
> Double scheduling is a concern with virtualization hosts where the host
> schedules vcpus without knowing what's run by the vcpu and the guest
> schedules tasks without knowing where the vcpu is physically running.
> This causes issues related to latencies, power consumption, resource
> utilization, etc. An ideal solution would be to have a cooperative
> scheduling framework where the guest and host share scheduling related
> information and make educated scheduling decisions to optimally handle
> the workloads. As a first step, we are taking a stab at reducing
> latencies for latency sensitive workloads in the guest.
>
> v1 RFC[1] was posted in December 2023. The main disagreement was over
> the implementation: the patch made scheduling policy decisions in kvm,
> and kvm is not the right place for them. The suggestion was to move the
> policy decisions outside of kvm and let kvm only handle the
> notifications needed to make the policy decisions. This patch series is
> an iterative step towards implementing the feature as a layered
> design where the policy could be implemented outside of kvm as a
> kernel built-in, a kernel module or a bpf program.
>
> This design consists mainly of 4 components:
>
> - pvsched driver: Implements the scheduling policies. Registers with
>   the host with a set of callbacks that the hypervisor (kvm) can use to
>   notify vcpu events that the driver is interested in. The callbacks
>   are passed the address of the shared memory so that the driver can
>   get scheduling information shared by the guest and also update the
>   scheduling policies set by the driver.
> - kvm component: Selects the pvsched driver for a guest and notifies
>   the driver via callbacks for events that the driver is interested
>   in.
>   Also interfaces with the guest to retrieve the shared memory
>   region for sharing the scheduling information.
> - host kernel component: Implements the APIs for:
>     - the pvsched driver to register with/unregister from the host
>       kernel, and
>     - the hypervisor to assign/unassign a driver for guests.
> - guest component: Implements a framework for sharing the scheduling
>   information with the pvsched driver through kvm.
>
> There is another component that we refer to as the pvsched protocol.
> This defines the details about shared memory layout, information
> sharing and scheduling policy decisions. The protocol need not be part
> of the kernel and can be defined separately based on the use case and
> requirements. Both the guest and the selected pvsched driver need to
> match the protocol for the feature to work. A protocol shall be
> identified by a name and a possible versioning scheme. The guest will
> advertise the protocol and then the hypervisor can assign the driver
> implementing the protocol if it is registered in the host kernel.
>
> This patch series only implements the first 3 components. The guest
> side implementation and the protocol framework shall come as a separate
> series once we finalize the rest of the design.
>
> This series also implements two sample pvsched drivers: a bpf program
> and a kernel built-in. They do not do any real work yet; they are just
> skeletons to demonstrate the feature.
>
> Rebased on 6.8.2.
>
> [1]: https://lwn.net/Articles/955145/
>
> Vineeth Pillai (Google) (5):
>   pvsched: paravirt scheduling framework
>   kvm: Implement the paravirt sched framework for kvm
>   kvm: interface for managing pvsched driver for guest VMs
>   pvsched: bpf support for pvsched
>   selftests/bpf: sample implementation of a bpf pvsched driver.
>
>  Kconfig                                       |   2 +
>  arch/x86/kvm/Kconfig                          |  13 +
>  arch/x86/kvm/x86.c                            |   3 +
>  include/linux/kvm_host.h                      |  32 +++
>  include/linux/pvsched.h                       | 102 +++++++
>  include/uapi/linux/kvm.h                      |   6 +
>  kernel/bpf/bpf_struct_ops_types.h             |   4 +
>  kernel/sysctl.c                               |  27 ++
>  .../testing/selftests/bpf/progs/bpf_pvsched.c |  37 +++
>  virt/Makefile                                 |   2 +-
>  virt/kvm/kvm_main.c                           | 265 ++++++++++++++++++
>  virt/pvsched/Kconfig                          |  12 +
>  virt/pvsched/Makefile                         |   2 +
>  virt/pvsched/pvsched.c                        | 215 ++++++++++++++
>  virt/pvsched/pvsched_bpf.c                    | 141 ++++++++++
>  15 files changed, 862 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/pvsched.h
>  create mode 100644 tools/testing/selftests/bpf/progs/bpf_pvsched.c
>  create mode 100644 virt/pvsched/Kconfig
>  create mode 100644 virt/pvsched/Makefile
>  create mode 100644 virt/pvsched/pvsched.c
>  create mode 100644 virt/pvsched/pvsched_bpf.c
>
> --
> 2.40.1
>
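
For readers who want something concrete for the layering in the cover letter, here is a minimal userspace sketch in C. It is not the code from the series. Every identifier in it (pvsched_register, guest_assign_driver, guest_notify, struct pvsched_shmem, the "boost-v1" protocol name, the event names) is a hypothetical stand-in, and the shared memory layout is invented for illustration. It only shows the shape of the idea: a driver registers callbacks under a protocol name, the hypervisor side assigns a driver to a guest by the protocol the guest advertises, and the hypervisor forwards only the vcpu events the driver asked for, along with the shared memory address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none are taken from the patches. */
enum pvsched_event { PVSCHED_VCPU_ENTER, PVSCHED_VCPU_EXIT, PVSCHED_NR_EVENTS };

/* Stand-in for the protocol-defined shared memory layout. */
struct pvsched_shmem {
	int guest_wants_boost;	/* written by the guest */
	int host_boosted;	/* written by the driver */
};

/* "pvsched driver": the policy, plus the events it wants to hear about. */
struct pvsched_driver {
	const char *protocol;	/* protocol name the driver implements */
	unsigned int events;	/* bitmask of (1u << enum pvsched_event) */
	void (*notify)(enum pvsched_event ev, struct pvsched_shmem *shm);
};

/* "host kernel component": a registry keyed by protocol name. */
#define MAX_DRIVERS 8
static const struct pvsched_driver *drivers[MAX_DRIVERS];

int pvsched_register(const struct pvsched_driver *drv)
{
	size_t i;

	if (!drv || !drv->protocol || !drv->notify)
		return -1;
	for (i = 0; i < MAX_DRIVERS; i++)	/* reject duplicate protocols */
		if (drivers[i] && !strcmp(drivers[i]->protocol, drv->protocol))
			return -1;
	for (i = 0; i < MAX_DRIVERS; i++) {
		if (!drivers[i]) {
			drivers[i] = drv;
			return 0;
		}
	}
	return -1;				/* registry full */
}

/* Real code would also need refcounting for guests still using drv. */
void pvsched_unregister(const struct pvsched_driver *drv)
{
	size_t i;

	for (i = 0; i < MAX_DRIVERS; i++)
		if (drivers[i] == drv)
			drivers[i] = NULL;
}

/* "kvm component": per-guest driver selection and event forwarding. */
struct guest {
	const struct pvsched_driver *drv;
	struct pvsched_shmem shm;
};

/* Assign the registered driver matching the protocol the guest advertises. */
int guest_assign_driver(struct guest *g, const char *protocol)
{
	size_t i;

	for (i = 0; i < MAX_DRIVERS; i++) {
		if (drivers[i] && !strcmp(drivers[i]->protocol, protocol)) {
			g->drv = drivers[i];
			return 0;
		}
	}
	return -1;				/* no driver for this protocol */
}

/* Forward an event only if the assigned driver subscribed to it. */
void guest_notify(struct guest *g, enum pvsched_event ev)
{
	assert(ev < PVSCHED_NR_EVENTS);
	if (g->drv && (g->drv->events & (1u << ev)))
		g->drv->notify(ev, &g->shm);
}

/* A skeleton driver: mirror the guest's boost request on vcpu entry. */
static void boost_notify(enum pvsched_event ev, struct pvsched_shmem *shm)
{
	if (ev == PVSCHED_VCPU_ENTER)
		shm->host_boosted = shm->guest_wants_boost;
}

const struct pvsched_driver boost_driver = {
	.protocol = "boost-v1",
	.events = 1u << PVSCHED_VCPU_ENTER,
	.notify = boost_notify,
};
```

A caller would register boost_driver, assign it to a guest with guest_assign_driver(&g, "boost-v1"), and then call guest_notify() on vcpu entry and exit. The point of the split is that guest_notify() carries no policy of its own: replacing boost_driver with a different policy requires no change on the hypervisor side, which is the separation asked for in the v1 review.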