Re: [PATCH] xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 12, 2012 at 11:01:49AM +0000, Jan Beulich wrote:
> >>> On 10.03.12 at 17:05, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
> > [v7: Made it a semi-cpufreq scaling type driver suggested by Jan Beulich 
> 
> I don't see any registration or other hooking up that would match this
> piece of the description.

Oh, a little stale. Should have said : "suggestion by Jan Beulich made me
think about not depending on CPU freq scaling drivers at all."
> 
> > --- a/drivers/xen/Kconfig
> > +++ b/drivers/xen/Kconfig
> > @@ -178,4 +178,21 @@ config XEN_PRIVCMD
> >  	depends on XEN
> >  	default m
> >  
> > +config XEN_ACPI_PROCESSOR
> > +	tristate "Xen ACPI processor"
> > +	depends on XEN && X86 && ACPI_PROCESSOR
> > +	default y if (X86_ACPI_CPUFREQ = y || X86_POWERNOW_K8 = y)
> > +	default m if (X86_ACPI_CPUFREQ = m || X86_POWERNOW_K8 = m)
> 
> I don't think the code itself has any dependencies on this, and as long
> as you interface directly with the "normal" ACPI processor driver there
> also shouldn't be any implicit dependency. Is this hence may just a
> leftover?

It is needed so that the CPU freq scaling drivers don't get called
before this driver is loaded.

Being that if the kernel was compiled with powernow-k8/acpi-cpufreq as
built in, we really want to be loaded before them - to thwart them
from loading. The one way this is done is by calling
 acpi_processor_notify_smm

which on subsequent calls will return -EBUSY for the cpufreq scaling drivers
and inhibit them from being called. Naturally it does not fix the case
if the drivers are built as modules - then the change to load xen-acpi-processor
should be done in the same init script as the one loading powernow-k8/acpi-cpufreq.,

It would be much easier if there was a cpuidle_disable variant in the cpufreq
scaling department - let me see if Dave Jones is open for this, as I can't
see an easy /clean way to make the cpuid checks in the acpi-cpufreq and powernow-k8
changed so that said drivers won't load.


> 
> > +	help
> > +          This ACPI processor uploads Power Management information to the Xen hypervisor.
> > +
> > +	  To do that the driver parses the Power Management data and uploads said
> > +	  information to the Xen hypervisor. Then the Xen hypervisor can select the
> > +          proper Cx and Pxx states. It also registers itslef as the SMM so that
> > +          other drivers (such as ACPI cpufreq scaling driver) will not load.
> > +
> > +          To compile this driver as a module, choose M here: the
> > +          module will be called xen_acpi_processor  If you do not know what to choose,
> > +          select M here. If the CPUFREQ drivers are built in, select Y here.
> > +
> >  endmenu
> > --- /dev/null
> > +++ b/drivers/xen/xen-acpi-processor.c
> > +#define NR_ACPI_CPUS	NR_CPUS
> > +#define MAX_ACPI_BITS	(BITS_TO_LONGS(NR_ACPI_CPUS))
> 
> Rather than arbitrarily limiting this, you could call into Xen at startup
> and scan over all CPUs' ACPI IDs to find the maximum. I think that
> would even cover statically declared hotplug ones, but adding some
> slack may still be necessary to cover dynamic hotplug ones (implying
> that pCPU hotplug patches will make it in at some point).

This would be the XENPF_get_cpuinfo call right? I do plan on looking at the
pCPU hotplug, but I think I need some fancy hardware to test it correctly?

> 
> >...
> > +static int __init check_acpi_ids(struct acpi_processor *pr_backup)
> > +{
> > +	if (!pr_backup)
> > +		return -ENODEV;
> > +
> > +	/* All online CPUs have been processed at this stage. Now verify
> > +	 * whether in fact "online CPUs" == physical CPUs.
> > +	 */
> > +	acpi_id_present = kcalloc(MAX_ACPI_BITS, sizeof(unsigned long), GFP_KERNEL);
> > +	if (!acpi_id_present)
> > +		return -ENOMEM;
> > +
> > +	memset(acpi_id_present, 0, MAX_ACPI_BITS * sizeof(unsigned long));
> 
> kcalloc() already clears the allocated memory.
> 
> > +
> > +	acpi_id_cst_present = kcalloc(MAX_ACPI_BITS, sizeof(unsigned long), GFP_KERNEL);
> > +	if (!acpi_id_cst_present) {
> > +		kfree(acpi_id_present);
> > +		return -ENOMEM;
> > +	}
> > +	memset(acpi_id_cst_present, 0, MAX_ACPI_BITS * sizeof(unsigned long));
> 
> Same here.

Ah yes, one of those paste-n-copy errors.
> 
> > +
> > +	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT,
> > +			    ACPI_UINT32_MAX,
> > +			    read_acpi_id, NULL, NULL, NULL);
> > +	acpi_get_devices("ACPI0007", read_acpi_id, NULL, NULL);
> > +
> > +	if (!bitmap_equal(acpi_id_present, acpi_ids_done, MAX_ACPI_BITS)) {
> > +		unsigned int i;
> > +
> > +		for_each_set_bit(i, acpi_id_present, MAX_ACPI_BITS) {
> > +			pr_backup->acpi_id = i;
> > +			/* Mask out C-states if there are no _CST */
> > +			pr_backup->flags.power = test_bit(i, acpi_id_cst_present);
> > +			(void)upload_pm_data(pr_backup);
> > +		}
> > +	}
> > +	kfree(acpi_id_present);
> > +	acpi_id_present = NULL;
> > +	kfree(acpi_id_cst_present);
> > +	acpi_id_cst_present = NULL;
> > +	return 0;
> > +}
> > +static int __init check_prereq(void)
> > +{
> > +	struct cpuinfo_x86 *c = &cpu_data(0);
> > +
> > +	if (!xen_initial_domain())
> > +		return -ENODEV;
> > +
> > +	if (!acpi_gbl_FADT.smi_command)
> > +		return -ENODEV;
> > +
> > +	if (c->x86_vendor == X86_VENDOR_INTEL) {
> > +		if (!cpu_has(c, X86_FEATURE_EST))
> > +			return -ENODEV;
> > +
> > +		return 0;
> > +	}
> > +	if (c->x86_vendor == X86_VENDOR_AMD) {
> > +		/* Copied from powernow-k8.h, can't include ../xen-acpi-processor/powernow
> > +		 * as we get compile warnings for the static functions.
> > +		 */
> 
> If drivers/cpufreq/powernow-k[78].h are supposed to be useful at all
> (which they currently aren't as both get included in just a single place),
> those static function declarations should be removed from the latter.

<nods> Will post a seperate patch for that.
> 
> > +#define CPUID_FREQ_VOLT_CAPABILITIES    0x80000007
> > +#define USE_HW_PSTATE                   0x00000080
> > +		u32 eax, ebx, ecx, edx;
> > +		cpuid(CPUID_FREQ_VOLT_CAPABILITIES, &eax, &ebx, &ecx, &edx);
> > +		if ((edx & USE_HW_PSTATE) != USE_HW_PSTATE)
> > +			return -ENODEV;
> > +		return 0;
> > +	}
> > +	return -ENODEV;
> > +}
> >...
> > +static int __init xen_acpi_processor_init(void)
> > +{
> > +	struct acpi_processor *pr_backup = NULL;
> > +	unsigned int i;
> > +	int rc = check_prereq();
> > +
> > +	if (rc)
> > +		return rc;
> > +
> > +	acpi_ids_done = kcalloc(MAX_ACPI_BITS, sizeof(unsigned long), GFP_KERNEL);
> > +	if (!acpi_ids_done)
> > +		return -ENOMEM;
> > +
> > +	memset(acpi_ids_done, 0, MAX_ACPI_BITS * sizeof(unsigned long));
> 
> Pointless kcalloc() + memset() again.
> 
> > +
> > +	acpi_perf_data = alloc_percpu(struct acpi_processor_performance);
> > +	if (!acpi_perf_data) {
> > +		pr_debug(DRV_NAME "Memory allocation error for acpi_perf_data.\n");
> > +		kfree(acpi_ids_done);
> > +		return -ENOMEM;
> > +	}
> > +	for_each_possible_cpu(i) {
> > +		if (!zalloc_cpumask_var_node(
> > +			&per_cpu_ptr(acpi_perf_data, i)->shared_cpu_map,
> > +			GFP_KERNEL, cpu_to_node(i))) {
> > +			rc = -ENOMEM;
> > +			goto err_out;
> > +		}
> > +	}
> > +
> > +	/* Do initialization in ACPI core */
> > +	rc = acpi_processor_preregister_performance(acpi_perf_data);
> > +	if (WARN_ON(rc))
> > +		goto err_out;
> > +
> > +	for_each_possible_cpu(i) {
> > +		struct acpi_processor_performance *perf;
> > +
> > +		perf = per_cpu_ptr(acpi_perf_data, i);
> > +		rc = acpi_processor_register_performance(perf, i);
> > +		if (WARN_ON(rc))
> > +			goto err_out;
> > +	}
> > +	rc = acpi_processor_notify_smm(THIS_MODULE);
> > +	if (WARN_ON(rc))
> > +		goto err_unregister;
> > +
> > +	for_each_possible_cpu(i) {
> > +		struct acpi_processor *_pr;
> > +		_pr = per_cpu(processors, i /* APIC ID */);
> > +		if (!_pr)
> > +			continue;
> > +
> > +		if (!pr_backup) {
> > +			pr_backup = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
> > +			memcpy(pr_backup, _pr, sizeof(struct acpi_processor));
> > +		}
> > +		(void)upload_pm_data(_pr);
> > +	}
> > +	rc = check_acpi_ids(pr_backup);
> > +	if (rc)
> > +		goto err_unregister;
> > +
> > +	kfree(pr_backup);
> > +	register_hotcpu_notifier(&xen_cpu_notifier);
> 
> This is pointless - you'd get notified of vCPU-s arrival/departure only.
> Without pCPU hotplug code in place, there's just nothing you can (and
> need to) do here.

I hadn't actually looked in details on the pCPU hotplug code to see how it works.

I presume I need special hardware for this to work as well as the ACPI
is involved in triggering the hotplug CPU up calls?

> 
> > +
> > +	return 0;
> > +err_unregister:
> > +	for_each_possible_cpu(i) {
> > +		struct acpi_processor_performance *perf;
> > +		perf = per_cpu_ptr(acpi_perf_data, i);
> > +		acpi_processor_unregister_performance(perf, i);
> > +	}
> > +err_out:
> > +	/* Freeing a NULL pointer is OK: alloc_percpu zeroes. */
> > +	free_acpi_perf_data();
> > +	kfree(acpi_ids_done);
> > +	return rc;
> > +}
> 
> Overall this version looks a lot better to me than the previous ones.

Yeah, and it drops the dependency on those Xen patches I had sent earlier.
Thanks for taking a look!

> 
> Jan
--
To unsubscribe from this list: send the line "unsubscribe cpufreq" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Devel]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Forum]     [Linux SCSI]

  Powered by Linux