RE: [PATCH v5 3/5] hyperv: Enable the hypercall output page for the VTL mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Stanislav Kinsburskii <skinsburskii@xxxxxxxxxxxxxxxxxxx> Sent: Monday, January 6, 2025 9:23 AM
> 
> On Fri, Jan 03, 2025 at 10:08:05PM +0000, Michael Kelley wrote:
> > From: Stanislav Kinsburskii <skinsburskii@xxxxxxxxxxxxxxxxxxx> Sent: Friday, January
> 3, 2025 11:20 AM
> > >
> > > On Mon, Dec 30, 2024 at 10:09:39AM -0800, Roman Kisel wrote:
> > > > Due to the hypercall page not being allocated in the VTL mode,
> > > > the code resorts to using a part of the input page.
> > > >
> > > > Allocate the hypercall output page in the VTL mode thus enabling
> > > > it to use it for output and share code with dom0.
> > > >
> > > > Signed-off-by: Roman Kisel <romank@xxxxxxxxxxxxxxxxxxx>
> > > > ---
> > > >  drivers/hv/hv_common.c | 6 +++---
> > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> > > > index c6ed3ba4bf61..c983cfd4d6c0 100644
> > > > --- a/drivers/hv/hv_common.c
> > > > +++ b/drivers/hv/hv_common.c
> > > > @@ -340,7 +340,7 @@ int __init hv_common_init(void)
> > > >  	BUG_ON(!hyperv_pcpu_input_arg);
> > > >
> > > >  	/* Allocate the per-CPU state for output arg for root */
> > > > -	if (hv_root_partition) {
> > > > +	if (hv_root_partition || IS_ENABLED(CONFIG_HYPERV_VTL_MODE)) {
> > >
> > > This check doesn't look nice.
> > > First of all, IS_ENABLED(CONFIG_HYPERV_VTL_MODE) doesn't mean that this
> > > particular kernel is being booted in VTL other that VTL0.
> >
> > Actually, it does mean that. Kernels built with CONFIG_HYPERV_VTL_MODE=y
> > will not run as a normal guest in VTL 0. See the third paragraph of the
> > "help" section for HYPERV_VTL_MODE in drivers/hv/Kconfig.
> >
> 
> Thanks for pointing to this piece.
> 
> This limitation looks aritificial to me and as VTL support in Linux is
> currently being extended beyond Underhill support, keeping this
> restriction makes some further development in scope of LVBS support
> complicated and error prone due to potential ABI mismatches between
> Linux kernels in different VTLs.
> 
> IOW, making the same kernel properly bootable (or - worse - explicitly
> un-bootable) in different VTLs is a more robust way in the long run.

The reason for the limitation is the sequencing of early Hyper-V-related
initialization steps. Knowing at runtime whether you are running at
VTL0 or some other VTL requires making a hypercall in get_vtl().
Unfortunately, the machinery for making a hypercall (setting the guest
OS ID, and allocating the x86 hypercall page) is established relatively late
during initialization, in hyperv_init(). But running in other than VTL0
requires the initializations that are done in hv_vtl_init_platform(), which
must be done much earlier. There's no clear way out of this conundrum
purely on the Linux guest side.

To solve the conundrum on x86, one possibility to consider is having
Hyper-V make HV_REGISTER_VSM_VP_STATUS available as a synthetic
MSR, which can be read without making a hypercall. This register could be
read in ms_hyperv_init_platform() to know if running at VTL0 or not.
Using synthetic MSRs is how other aspects of early Hyper-V-related
initialization is done in Linux on x86.

I think there's some discussion on the x86 sequencing issues on LKML
from when the VTL code was first added. I was part of that discussion, but
don't remember all the details. There might additional issues raised in
that discussion.

The sequencing issues would also need to be sorted out on the arm64
side, as they are different from x86. We don't have an early Hyper-V
specific hook like ms_hyperv_init_platform() on the arm64 side, so
that might be problem. But on the flip side, we also don't have the
x86-specific messiness that hv_vtl_init_platform() handles. Also,
there are no synthetic MSRs on arm64, so register accesses always
use hypercalls, but there's no hypercall page needed. On balance, I
think getting VTL stuff initialized on arm64 will be easier, but I'm not sure.

Michael

> 
> Thanks,
> Stas
> 
> > Michael
> >
> > >
> > > Second, current approach is that a VTL1+ kernel is a different build from VTL0
> > > kernel and thus relying on the config option looks reasonable. However,
> > > this is not true in general case.
> > >
> > > I'd suggest one of the following three options:
> > >
> > > 1. Always allocate per-cpu output pages. This is wasteful for the VTL0
> > > guest case, but it might worth it for overall simplification.
> > >
> > > 2. Don't touch this code and keep the cnage in the Underhill tree.
> > >
> > > 3. Introduce a configuration option (command line or device tree or
> > > both) and use it instead of the kernel config option.
> > >
> > > Thanks,
> > > Stas
> > >
> > > >  		hyperv_pcpu_output_arg = alloc_percpu(void *);
> > > >  		BUG_ON(!hyperv_pcpu_output_arg);
> > > >  	}
> > > > @@ -435,7 +435,7 @@ int hv_common_cpu_init(unsigned int cpu)
> > > >  	void **inputarg, **outputarg;
> > > >  	u64 msr_vp_index;
> > > >  	gfp_t flags;
> > > > -	int pgcount = hv_root_partition ? 2 : 1;
> > > > +	const int pgcount = (hv_root_partition ||
> > > IS_ENABLED(CONFIG_HYPERV_VTL_MODE)) ? 2 : 1;
> > > >  	void *mem;
> > > >  	int ret;
> > > >
> > > > @@ -453,7 +453,7 @@ int hv_common_cpu_init(unsigned int cpu)
> > > >  		if (!mem)
> > > >  			return -ENOMEM;
> > > >
> > > > -		if (hv_root_partition) {
> > > > +		if (hv_root_partition || IS_ENABLED(CONFIG_HYPERV_VTL_MODE)) {
> > > >  			outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> > > >  			*outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
> > > >  		}
> > > > --
> > > > 2.34.1
> > > >





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux