Re: A clocksource driver for HyperV

>>> On 4/5/2010 at 5:36 PM, in message <4BBA57FB.2000406@xxxxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
> On 04/05/2010 01:30 PM, Ky Srinivasan wrote:
>> +static cycle_t read_hv_clock(struct clocksource *arg)
>> +{
>> +	cycle_t current_tick;
>> +	/*
>> +	 * Read the partition counter to get the current tick count. This count
>> +	 * is set to 0 when the partition is created and is incremented in
>> +	 * 100 nanosecond units.
>> +	 */
>> +	rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
>> +	return current_tick;
>> +}
>> +
>> +static struct clocksource clocksource_hyperv = {
>> +	.name           = "hyperv_clocksource",
>>    
> 
> Seems like a redundantly long name; any use of this string is going to 
> be in a context where it is obviously a clocksource.  How about just 
> "hyperv"


> 
>> +	.rating         = 400, /* use this when running on Hyperv*/
>> +	.read           = read_hv_clock,
>> +	.mask           = CLOCKSOURCE_MASK(64),
>> +	.shift          = HV_CLOCK_SHIFT,
>> +};
>> +
>> +static struct dmi_system_id __initconst
>> +hv_timesource_dmi_table[] __maybe_unused  = {
>> +	{
>> +		.ident = "Hyper-V",
>> +		.matches = {
>> +			DMI_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
>> +			DMI_MATCH(DMI_PRODUCT_NAME, "Virtual Machine"),
>> +			DMI_MATCH(DMI_BOARD_NAME, "Virtual Machine"),
>> +		},
>> +	},
>> +	{ },
>> +};
>> +MODULE_DEVICE_TABLE(dmi, hv_timesource_dmi_table);
>>    
> 
> So you use the DMI signatures to determine whether the module is needed, 
> but cpuid to work out if the feature is present?
> 
>> +
>> +static struct pci_device_id __initconst
>> +hv_timesource_pci_table[] __maybe_unused = {
>> +	{ PCI_DEVICE(0x1414, 0x5353) }, /* VGA compatible controller */
>> +	{ 0 }
>> +};
>> +MODULE_DEVICE_TABLE(pci, hv_timesource_pci_table);
>>    
> 
> And/or PCI?
> 
> Seems a bit... ad-hoc?  Is this the official way to determine the 
> presence of Hyper-V?

The presence of HyperV, and our ability to use the partition-wide counter, is checked by probing the cpuid leaves. The DMI/PCI signatures are used for auto-loading these modules.
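
To make the cpuid side concrete: the check boils down to packing EBX, ECX and EDX from leaf 0x40000000 into a 12-byte buffer and comparing it with "Microsoft Hv". Below is a minimal userspace sketch of just that comparison. The helper names hv_pack_signature() and hv_signature_matches() are made up for illustration and do not appear in the patch, and the register values would come from cpuid in real code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HV_SIG_LEN 12	/* EBX + ECX + EDX, four bytes each */

/* Hypothetical helper: copy the three registers into a 12-byte buffer. */
static void hv_pack_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
			      char sig[HV_SIG_LEN])
{
	assert(sig != NULL);
	/* memcpy is used here instead of pointer casts into the buffer */
	memcpy(sig + 0, &ebx, sizeof(ebx));
	memcpy(sig + 4, &ecx, sizeof(ecx));
	memcpy(sig + 8, &edx, sizeof(edx));
}

/* Hypothetical helper: returns 1 if the registers spell "Microsoft Hv". */
static int hv_signature_matches(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	char sig[HV_SIG_LEN];

	hv_pack_signature(ebx, ecx, edx, sig);
	/* The buffer is not NUL-terminated, so compare a fixed length. */
	return memcmp(sig, "Microsoft Hv", HV_SIG_LEN) == 0;
}
```

Because the buffer carries no terminator, a fixed-length memcmp is the natural comparison here, and any code that prints the signature needs a bounded format as well.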
> 
>> +
>> +
>> +static int __init hv_detect_hyperv(void)
>>    
> 
> This looks generally useful.  Should it be hidden away in the 
> clocksource driver, or in some common hyper-v code?  Do other hyper-v 
> drivers have versions of this?
Good point. Right now, I can think of multiple drivers replicating this code. We could include HyperV detection code in cpu/hypervisor.c. I will spin up a patch for doing that shortly.
> 
>> +{
>> +	u32 eax, ebx, ecx, edx;
>> +	static char hyp_signature[20];
>>    
> 20?  static?
> 
>> +
>> +	cpuid(1, &eax, &ebx, &ecx, &edx);
>> +	if (!(ecx & HV_HYPERVISOR_PRESENT_BIT)) {
>> +		printk(KERN_WARNING
>> +			"Not on a Hypervisor\n");
>>    
> This just looks like noise, especially since it doesn't identify what is 
> generating the message.  And if you compile this code in as =y 
> (non-modular) then it will complain every boot.
> 
>> +		return 1;
>> +	}
>> +	cpuid(HV_CPUID_SIGNATURE, &eax, &ebx, &ecx, &edx);
>> +	*(u32 *)(hyp_signature + 0) = ebx;
>> +	*(u32 *)(hyp_signature + 4) = ecx;
>> +	*(u32 *)(hyp_signature + 8) = edx;
>> +	hyp_signature[12] = 0;
>> +
>> +	if ((eax < HV_CPUID_MIN) || (strcmp("Microsoft Hv", hyp_signature))) {
>>    
> 
> memcmp, surely?
> 
>> +		printk(KERN_WARNING
>> +			"Not on HyperV; signature %s, eax %x\n",
>> +			hyp_signature, eax);
>> +		return 1;
>> +	}
>> +	/*
>> +	 * Extract the features, recommendations etc.
>> +	 */
>> +	cpuid(HV_CPUID_FEATURES, &eax, &ebx, &ecx, &edx);
>> +	if (!(eax & 0x10)) {
>> +		printk(KERN_WARNING "HyperV Time Ref Counter not available!\n");
>> +		return 1;
>> +	}
>> +
>> +	cpuid(HV_CPUID_RECOMMENDATIONS, &eax, &ebx, &ecx, &edx);
>> +	printk(KERN_INFO "HyperV recommendations: %x\n", eax);
>> +	printk(KERN_INFO "HyperV spin count: %x\n", ebx);
>> +	return 0;
>> +}
>> +
>> +
>> +static int __init init_hv_clocksource(void)
>> +{
>> +	if (hv_detect_hyperv())
>> +		return -ENODEV;


>> +	/*
>> +	 * The time ref counter in HyperV is in 100ns units.
>> +	 * The definition of mult is:
>> +	 * mult/2^shift = ns/cyc = 100
>> +	 * mult = (100 << shift)
>> +	 */
>> +	clocksource_hyperv.mult = (100 << HV_CLOCK_SHIFT);
>>    
> 
> Why not initialize this in the structure?  It's just 100<<22 isn't it?
> 
>> +	printk(KERN_INFO "Registering HyperV clock source\n");
>> +	return clocksource_register(&clocksource_hyperv);
>> +}
>> +
>> +module_init(init_hv_clocksource);
>> +MODULE_DESCRIPTION("HyperV based clocksource");
>> +MODULE_AUTHOR("K. Y. Srinivasan <ksrinivasan@xxxxxxxxxx>");
>> +MODULE_LICENSE("GPL");
>> Index: linux/drivers/staging/hv/Makefile
>> ===================================================================
>> --- linux.orig/drivers/staging/hv/Makefile	2010-04-05 13:02:06.000000000 -0600
>> +++ linux/drivers/staging/hv/Makefile	2010-04-05 13:02:13.000000000 -0600
>> @@ -1,4 +1,4 @@
>> -obj-$(CONFIG_HYPERV)		+= hv_vmbus.o
>> +obj-$(CONFIG_HYPERV)		+= hv_vmbus.o hv_timesource.o
>>   obj-$(CONFIG_HYPERV_STORAGE)	+= hv_storvsc.o
>>   obj-$(CONFIG_HYPERV_BLOCK)	+= hv_blkvsc.o
>>   obj-$(CONFIG_HYPERV_NET)	+= hv_netvsc.o
>>    

Jeremy, thank you for your comments. I am attaching the next version of this patch, which addresses the comments I have received so far.
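
For reference, the arithmetic behind the mult value used with the 100ns reference counter can be checked outside the kernel. The sketch below mirrors the usual (cycles * mult) >> shift conversion. cyc2ns_sketch() and HV_CLOCK_MULT are illustrative names, not kernel or patch identifiers, and the shift of 22 is the HV_CLOCK_SHIFT value from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define HV_CLOCK_SHIFT 22			/* same value as in the patch */
#define HV_CLOCK_MULT  (100u << HV_CLOCK_SHIFT)	/* 100 ns per tick */

/* Illustrative only: convert reference-counter ticks to nanoseconds. */
static uint64_t cyc2ns_sketch(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	assert(shift < 64);	/* shifting by >= 64 is undefined in C */
	return (cycles * mult) >> shift;
}
```

With these values one tick converts to 100 ns and 10,000,000 ticks convert to one second. The 64-bit product wraps once cycles exceeds about 2^64 / mult (roughly 4.4e10 ticks), so the identity only holds for deltas below that.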

Regards,

K. Y 

From: K. Y. Srinivasan <ksrinivasan@xxxxxxxxxx>
Subject:  A clocksource for Linux guests hosted on HyperV.
References: None
Patch-mainline: 

This patch adds a clocksource for Linux guests hosted on HyperV.
Timekeeping in such guests is unstable. This driver addresses the problem
by using the HyperV partition reference counter, which ticks in 100ns
units, as the clocksource.

Signed-off-by: K. Y. Srinivasan <ksrinivasan@xxxxxxxxxx>

Index: linux/drivers/staging/hv/hv_timesource.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux/drivers/staging/hv/hv_timesource.c	2010-04-07 12:17:29.000000000 -0600
@@ -0,0 +1,147 @@
+/*
+ * A clocksource for Linux running on HyperV.
+ *
+ *
+ * Copyright (C) 2010, Novell, Inc.
+ * Author : K. Y. Srinivasan <ksrinivasan@xxxxxxxxxx>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT.  See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+
+#include <linux/version.h>
+#include <linux/clocksource.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/dmi.h>
+
+#define HV_CLOCK_SHIFT	22
+/*
+ * HyperV defined synthetic CPUID leaves:
+ */
+#define HV_CPUID_SIGNATURE	0x40000000
+#define HV_CPUID_MIN		0x40000005
+#define HV_HYPERVISOR_PRESENT_BIT	0x80000000
+#define HV_CPUID_FEATURES	0x40000003
+#define HV_CPUID_RECOMMENDATIONS	0x40000004
+
+/*
+ * HyperV defined synthetic MSRs
+ */
+
+#define HV_X64_MSR_TIME_REF_COUNT	0x40000020
+
+
+static cycle_t read_hv_clock(struct clocksource *arg)
+{
+	cycle_t current_tick;
+	/*
+	 * Read the partition counter to get the current tick count. This count
+	 * is set to 0 when the partition is created and is incremented in
+	 * 100 nanosecond units.
+	 */
+	rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
+	return current_tick;
+}
+
+static struct clocksource hyperv_cs = {
+	.name           = "hyperv_clocksource",
+	.rating         = 400, /* use this when running on Hyperv*/
+	.read           = read_hv_clock,
+	.mask           = CLOCKSOURCE_MASK(64),
+	/*
+	 * The time ref counter in HyperV is in 100ns units.
+	 * The definition of mult is:
+	 * mult/2^shift = ns/cyc = 100
+	 * mult = (100 << shift)
+	 */
+	.mult           = (100 << HV_CLOCK_SHIFT),
+	.shift          = HV_CLOCK_SHIFT,
+};
+
+static const struct dmi_system_id __initconst
+hv_timesource_dmi_table[] __maybe_unused  = {
+	{
+		.ident = "Hyper-V",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Virtual Machine"),
+			DMI_MATCH(DMI_BOARD_NAME, "Virtual Machine"),
+		},
+	},
+	{ },
+};
+MODULE_DEVICE_TABLE(dmi, hv_timesource_dmi_table);
+
+static const struct pci_device_id __initconst
+hv_timesource_pci_table[] __maybe_unused = {
+	{ PCI_DEVICE(0x1414, 0x5353) }, /* VGA compatible controller */
+	{ 0 }
+};
+MODULE_DEVICE_TABLE(pci, hv_timesource_pci_table);
+
+
+static int __init hv_detect_hyperv(void)
+{
+	u32 eax, ebx, ecx, edx;
+	char hyp_signature[20];
+
+	cpuid(1, &eax, &ebx, &ecx, &edx);
+
+	if (!(ecx & HV_HYPERVISOR_PRESENT_BIT))
+		return 1;
+
+	cpuid(HV_CPUID_SIGNATURE, &eax, &ebx, &ecx, &edx);
+	*(u32 *)(hyp_signature + 0) = ebx;
+	*(u32 *)(hyp_signature + 4) = ecx;
+	*(u32 *)(hyp_signature + 8) = edx;
+
+	if ((eax < HV_CPUID_MIN) ||
+	    (memcmp("Microsoft Hv", hyp_signature, 12))) {
+		printk(KERN_WARNING
+			"Not on HyperV; signature %.12s, eax %x\n",
+			hyp_signature, eax);
+		return 1;
+	}
+	/*
+	 * Extract the features, recommendations etc.
+	 */
+	cpuid(HV_CPUID_FEATURES, &eax, &ebx, &ecx, &edx);
+	if (!(eax & 0x10)) {
+		printk(KERN_WARNING "HyperV Time Ref Counter not available!\n");
+		return 1;
+	}
+
+	cpuid(HV_CPUID_RECOMMENDATIONS, &eax, &ebx, &ecx, &edx);
+	printk(KERN_INFO "HyperV recommendations: %x\n", eax);
+	printk(KERN_INFO "HyperV spin count: %x\n", ebx);
+	return 0;
+}
+
+
+static int __init init_hv_clocksource(void)
+{
+	if (hv_detect_hyperv())
+		return -ENODEV;
+	printk(KERN_INFO "Registering HyperV clock source\n");
+	return clocksource_register(&hyperv_cs);
+}
+
+module_init(init_hv_clocksource);
+MODULE_DESCRIPTION("HyperV based clocksource");
+MODULE_AUTHOR("K. Y. Srinivasan <ksrinivasan@xxxxxxxxxx>");
+MODULE_LICENSE("GPL");
Index: linux/drivers/staging/hv/Makefile
===================================================================
--- linux.orig/drivers/staging/hv/Makefile	2010-04-07 12:17:25.000000000 -0600
+++ linux/drivers/staging/hv/Makefile	2010-04-07 12:17:29.000000000 -0600
@@ -1,4 +1,4 @@
-obj-$(CONFIG_HYPERV)		+= hv_vmbus.o
+obj-$(CONFIG_HYPERV)		+= hv_vmbus.o hv_timesource.o
 obj-$(CONFIG_HYPERV_STORAGE)	+= hv_storvsc.o
 obj-$(CONFIG_HYPERV_BLOCK)	+= hv_blkvsc.o
 obj-$(CONFIG_HYPERV_NET)	+= hv_netvsc.o