Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes: > On Tue, 31 Oct 2017, mikelley@xxxxxxxxxxxxxxxxxxxxxx wrote: > > diff --git a/arch/x86/include/uapi/asm/hyperv.h > > b/arch/x86/include/uapi/asm/hyperv.h > > index f65d125..408cf3e 100644 > > --- a/arch/x86/include/uapi/asm/hyperv.h > > +++ b/arch/x86/include/uapi/asm/hyperv.h > > @@ -112,6 +112,22 @@ > > #define HV_X64_GUEST_IDLE_STATE_AVAILABLE (1 << 5) > > /* Guest crash data handler available */ > > #define HV_X64_GUEST_CRASH_MSR_AVAILABLE (1 << 10) > > +/* Debug MSRs available */ > > +#define HV_X64_DEBUG_MSR_AVAILABLE (1 << 11) > > +/* Support for Non-Privileged Instruction Execution Prevention is available */ > > +#define HV_X64_NPIEP_AVAILABLE (1 << 12) > > +/* Support for DisableHypervisor is available */ > > +#define HV_X64_DISABLE_HYPERVISOR_AVAILABLE (1 << 13) > > +/* Extended GVA Ranges for Flush Virtual Address list is available */ > > +#define HV_X64_EXTENDED_GVA_RANGE_AVAILABLE (1 << 14) > > +/* Return Hypercall output via XMM registers is available */ > > +#define HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE (1 << 15) > > +/* SINT polling mode available */ > > +#define HV_X64_SINT_POLLING_MODE_AVAILABLE (1 << 17) > > +/* Hypercall MSR lock is available */ > > +#define HV_X64_HYPERCALL_MSR_LOCK_AVAILABLE (1 << 18) > > +/* stimer direct mode is available */ > > +#define HV_X64_STIMER_DIRECT_MODE_AVAILABLE (1 << 19) > > How are all these defines (except the last one related to that patch? Will move to a separate patch. > > > +/* Hardware IRQ number to use for stimer0 in Direct Mode. This IRQ > > +is a fake > > + * because stimer's in Direct Mode simply interrupt on the specified > > +vector, > > + * without using a particular IOAPIC pin. But we use the IRQ > > +allocation > > + * machinery, so we need a hardware IRQ #. This value is somewhat > > +arbitrary, > > + * but it should not be a legacy IRQ (0 to 15), and should fit within > > +the > > + * single IOAPIC (0 to 23) that Hyper-V provides to a guest VM. So > > +any value > > + * between 16 and 23 should be good. > > + */ > > +#define HV_STIMER0_IRQNR 18 > > Why would you want abuse an IOAPIC interrupt if all you need is a vector? Allocating a vector up-front like the existing HYPERVISOR_CALLBACK_VECTOR would certainly be more straightforward, and in fact, my original internal testing of stimer Direct Mode used that approach. Vectors are a limited resource, so I wanted to find a way to allocate one on-the-fly rather than use fixed pre-allocation, even under CONFIG_HYPERV. But I've got no problem with allocating a vector up-front and skipping all the irq/APIC manipulation and related issues. Any guidance on which vector to use? > > > +/* Routines to do per-architecture handling of stimer0 when in Direct > > +Mode */ > > + > > +void hv_ack_stimer0_interrupt(struct irq_desc *desc) { > > + ack_APIC_irq(); > > +} > > + > > +static void allonline_vector_allocation_domain(int cpu, struct cpumask *retmask, > > + const struct cpumask *mask) > > +{ > > + cpumask_copy(retmask, cpu_online_mask); } > > + > > +int hv_allocate_stimer0_irq(int *irq, int *vector) { > > + int localirq; > > + int result; > > + struct irq_data *irq_data; > > + > > + /* The normal APIC vector allocation domain allows allocation of > > +vectors > > Please fix you comment style. Multi line comments are: > > /* > * Bla.... > * foo... > */ > Will do. > > + * only for the calling CPU. So we change the allocation domain to one > > + * that allows vectors to be allocated in all online CPUs. This > > + * change is fine in a Hyper-V VM because VMs don't have the usual > > + * complement of interrupting devices. > > + */ > > + apic->vector_allocation_domain = allonline_vector_allocation_domain; > > This does not work anymore. vector_allocation_domain is gone as of 4.15. Please check the > vector rework in > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip x86/apic > > Aside of that what guarantees that all cpus are online at the point where you allocate that > interrupt? Nothing, so the vector will be not reserved or allocated on offline CPUs. Now guess > what happens if you bring the offline CPUs online later, it will drown in spurious interrupts. > Worse, the vector can also be reused for a device interrupt. Great plan. > > > + localirq = acpi_register_gsi(NULL, HV_STIMER0_IRQNR, > > + ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH); > > + if (localirq < 0) { > > + pr_err("Cannot register stimer0 gsi. Error %d", localirq); > > + return -1; > > + } > > + > > + /* We pass in a dummy IRQ handler because architecture independent code > > + * will later override the IRQ domain interrupt handler and set it to a > > + * Hyper-V specific handler. > > + */ > > + result = request_irq(localirq, (irq_handler_t)(-1), 0, > "Hyper-V stimer0", NULL); > > That's a crude hack. Really. > > > + if (result) { > > + pr_err("Cannot request stimer0 irq. Error %d", result); > > + acpi_unregister_gsi(localirq); > > + return -1; > > + } > > + irq_data = irq_domain_get_irq_data(x86_vector_domain, localirq); > > + *vector = irqd_cfg(irq_data)->vector; > > + *irq = localirq; > > Uurgh, no. This is even more of a layering violation. Grab random data from wherever it > comes and then expect that it works. This will simply fall apart the moment someone changes > the affinity of this interrupt. It will move to some random other vector and the system drowns > in spurious interrupts on the old vector. > > > +/* ISR for when stimer0 is operating in Direct Mode. Direct Mode > > +does > > + * not use VMBus or any VMBus messages, so process here and not in > > +the > > + * VMBus driver code. > > + */ > > + > > +static void hv_stimer0_isr(struct irq_desc *desc) { > > + struct hv_per_cpu_context *hv_cpu; > > + > > + __this_cpu_inc(*desc->kstat_irqs); > > + __this_cpu_inc(kstat.irqs_sum); > > + hv_ack_stimer0_interrupt(desc); > > + hv_cpu = this_cpu_ptr(hv_context.cpu_context); > > + hv_cpu->clk_evt->event_handler(hv_cpu->clk_evt); > > + add_interrupt_randomness(desc->irq_data.irq, 0); } > > + > > static int hv_ce_set_next_event(unsigned long delta, > > struct clock_event_device *evt) > > { > > @@ -108,6 +149,8 @@ static int hv_ce_shutdown(struct > > clock_event_device *evt) { > > hv_init_timer(HV_X64_MSR_STIMER0_COUNT, 0); > > hv_init_timer_config(HV_X64_MSR_STIMER0_CONFIG, 0); > > + if (stimer_direct_mode) > > + hv_disable_stimer0_percpu_irq(stimer0_irq); > > Whats the point of that? It's an empty inline: > > > +#if IS_ENABLED(CONFIG_HYPERV) > > +static inline void hv_enable_stimer0_percpu_irq(int irq) { } static > > +inline void hv_disable_stimer0_percpu_irq(int irq) { } We've got code in the works for Hyper-V on ARM64. This is architecture independent code that caters to the ARM64 side where stimer Direct Mode will use percpu IRQs, and hooks are needed to call enable_percpu_irq() and disable_percpu_irq(). > > > + if (stimer_direct_mode) { > > + > > + /* When it expires, the timer will directly interrupt > > + * on the specific hardware vector. > > + */ > > + timer_cfg.direct_mode = 1; > > + timer_cfg.apic_vector = stimer0_vector; > > + hv_enable_stimer0_percpu_irq(stimer0_irq); > > Ditto. > > > + if (stimer_direct_mode) { > > + if (hv_allocate_stimer0_irq(&stimer0_irq, &stimer0_vector)) > > + goto err; > > + irq_set_handler(stimer0_irq, hv_stimer0_isr); > > What you really want to do here is to allocate a fixed vector like we do for the other percpu > interrupts (local apic timer, IPIs etc.) and use that. This must be done at boot time, when you > detect that the kernel runs on HyperV. This makes sure, that all CPUs will have the vector > populated, even those which come online late. Will submit a v2 with that approach. > > Though you can't do that from a module. You have to do that setup during early boot from > ms_hyperv_init_platform(). That's the only sane way to deal with that. > > Thanks, > > tglx _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel