Re: [RFC][PATCH] PM: Avoid losing wakeup events during suspend

On Sun, Jun 20, 2010 at 12:05:35AM +0200, Rafael J. Wysocki wrote:
> Hi,
> 
> One of the arguments during the suspend blockers discussion was that the
> mainline kernel didn't contain any mechanisms allowing it to avoid losing
> wakeup events during system suspend.
> 
> Generally, there are two problems in that area.  First, if a wakeup event
> occurs exactly at the same time when /sys/power/state is being written to,
> the event may be delivered to user space right before the freezing of it,
> in which case the user space consumer of the event may not be able to process
Yes, this is racy.  Shouldn't the wakeup event handlers/drivers force a
user-mode ACK before they stop failing suspend attempts?

> it before the system is suspended.  Second, if a wakeup event occurs after user
> space has been frozen and that event is not a wakeup interrupt, the kernel will
> not react to it and the system will be suspended.

If it's not a wakeup interrupt, is it not fair to allow the suspend to
happen even if its handlers are "in flight" at suspend time?
> 
> The following patch illustrates my idea of how these two problems may be
> addressed.  It introduces a new global sysfs attribute,
> /sys/power/wakeup_count, associated with a running counter of wakeup events
> and a helper function, pm_wakeup_event(), that may be used by kernel subsystems
> to increment the wakeup events counter.
> 
> /sys/power/wakeup_count may be read from or written to by user space.  Reads
> will always succeed and return the current value of the wakeup events counter.
> Writes, however, will only succeed if the written number is equal to the
> current value of the wakeup events counter.  If a write is successful, it will
> cause the kernel to save the current value of the wakeup events counter and
> to compare the saved number with the current value of the counter at certain
> points of the subsequent suspend (or hibernate) sequence.  If the two values
> don't match, the suspend will be aborted just as though a wakeup interrupt
> happened.  Reading from /sys/power/wakeup_count again will turn that mechanism
> off.

Why would you want to turn it off?

> 
> The assumption is that there's a user space power manager that will first
> read from /sys/power/wakeup_count.  Then it will check all user space consumers
> of wakeup events known to it for unprocessed events.  If there are any, it will
> wait for them to be processed and repeat.  In turn, if there are not any,
> it will try to write to /sys/power/wakeup_count and if the write is successful,
> it will write to /sys/power/state to start suspend, so if any wakeup events
> occur past that point, they will be noticed by the kernel and will eventually
> cause the suspend to be aborted.
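
To check my reading of the sequence described above, here is a minimal
user space sketch of one suspend attempt.  It is only an illustration,
not part of the patch: try_suspend(), read_attr(), write_attr() and
no_pending_events() are hypothetical names, the attribute paths are
passed in by the caller, "mem" is assumed as the target state, and the
retry case relies on the write failing with EINVAL as
wakeup_count_store() in the patch does.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read an attribute into buf as a NUL-terminated string. 0 or -1. */
static int read_attr(const char *path, char *buf, size_t len)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = read(fd, buf, len - 1);
	close(fd);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}

/* Write a string to an attribute. 0 on success, negative errno otherwise. */
static int write_attr(const char *path, const char *val)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;
	int err = 0;

	if (fd < 0)
		return -errno;
	n = write(fd, val, strlen(val));
	if (n < 0)
		err = -errno;
	else if ((size_t)n != strlen(val))
		err = -EIO;
	close(fd);
	return err;
}

/* Placeholder for "ask all known consumers about unprocessed events". */
int no_pending_events(void)
{
	return 0;
}

/*
 * One suspend attempt.  Returns 0 if the state write went through,
 * 1 if the caller should wait and retry, -1 on an unexpected error.
 */
int try_suspend(const char *count_path, const char *state_path,
		int (*events_pending)(void))
{
	char count[32];
	int err;

	if (read_attr(count_path, count, sizeof(count)))
		return -1;
	if (events_pending())
		return 1;	/* consumers still busy: retry later */
	err = write_attr(count_path, count);
	if (err == -EINVAL)
		return 1;	/* counter changed since we read it */
	if (err)
		return -1;
	return write_attr(state_path, "mem") ? -1 : 0;
}
```

A power manager would presumably call this in a loop with
/sys/power/wakeup_count and /sys/power/state, sleeping or waiting on
its consumers whenever it returns 1.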
> 
> In addition to the above, the patch adds a wakeup events counter to the
> power member of struct device and makes these per-device wakeup event counters
> available via sysfs, so that it's possible to check the activity of various
> wakeup event sources within the kernel.
> 
> To illustrate how subsystems can use pm_wakeup_event(), I added it to the
> PCI runtime PM wakeup-handling code.
> 
> At the moment the patch only contains code changes (ie. no documentation),
> but I'm going to add comments etc. if people like the idea.
> 
> Please tell me what you think.
> 
> Rafael
> 
> ---
>  drivers/base/power/Makefile     |    2 -
>  drivers/base/power/main.c       |    1 
>  drivers/base/power/power.h      |    3 +
>  drivers/base/power/sysfs.c      |    9 ++++
>  drivers/base/power/wakeup.c     |   74 ++++++++++++++++++++++++++++++++++++++++
>  drivers/pci/pci-acpi.c          |    2 +
>  drivers/pci/pcie/pme/pcie_pme.c |    2 +
>  include/linux/pm.h              |    6 +++
>  kernel/power/hibernate.c        |   14 ++++---
>  kernel/power/main.c             |   24 ++++++++++++
>  kernel/power/power.h            |    6 +++
>  kernel/power/suspend.c          |    2 -
>  12 files changed, 138 insertions(+), 7 deletions(-)
> 
> Index: linux-2.6/kernel/power/main.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/main.c
> +++ linux-2.6/kernel/power/main.c
> @@ -204,6 +204,28 @@ static ssize_t state_store(struct kobjec
>  
>  power_attr(state);
>  
> +static ssize_t wakeup_count_show(struct kobject *kobj,
> +				struct kobj_attribute *attr,
> +				char *buf)
> +{
> +	return sprintf(buf, "%lu\n", pm_get_wakeup_count());
> +}
> +
> +static ssize_t wakeup_count_store(struct kobject *kobj,
> +				struct kobj_attribute *attr,
> +				const char *buf, size_t n)
> +{
> +	unsigned long val;
> +
> +	if (sscanf(buf, "%lu", &val) == 1) {
> +		if (pm_save_wakeup_count(val))
> +			return n;
> +	}
> +	return -EINVAL;
> +}
> +
> +power_attr(wakeup_count);
> +
>  #ifdef CONFIG_PM_TRACE
>  int pm_trace_enabled;
>  
> @@ -236,6 +258,7 @@ static struct attribute * g[] = {
>  #endif
>  #ifdef CONFIG_PM_SLEEP
>  	&pm_async_attr.attr,
> +	&wakeup_count_attr.attr,
>  #ifdef CONFIG_PM_DEBUG
>  	&pm_test_attr.attr,
>  #endif
> @@ -266,6 +289,7 @@ static int __init pm_init(void)
>  	int error = pm_start_workqueue();
>  	if (error)
>  		return error;
> +	pm_wakeup_events_init();
>  	power_kobj = kobject_create_and_add("power", NULL);
>  	if (!power_kobj)
>  		return -ENOMEM;
> Index: linux-2.6/drivers/base/power/wakeup.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6/drivers/base/power/wakeup.c
> @@ -0,0 +1,74 @@
> +
> +#include <linux/device.h>
> +#include <linux/pm.h>
> +
> +static unsigned long event_count;
> +static unsigned long saved_event_count;

What about overflow issues?
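
Thinking about it a bit more: unsigned arithmetic in C wraps around
modulo ULONG_MAX + 1, and the patch only ever compares the two counters
for equality, so a wrap by itself may be harmless.  As far as I can
tell, an event would only be missed if an exact multiple of
ULONG_MAX + 1 events arrived between the save and the check.  A small
user space sketch of that reasoning (wakeup_seen() and bump() are
made-up helpers, not functions from the patch):

```c
#include <limits.h>
#include <stdbool.h>

/* The patch only tests equality, so this is all the check relies on. */
static bool wakeup_seen(unsigned long saved, unsigned long current)
{
	return saved != current;
}

/* Stand-in for event_count++ on a given starting value. */
static unsigned long bump(unsigned long count)
{
	return count + 1;	/* wraps modulo ULONG_MAX + 1 */
}
```

With saved == ULONG_MAX, one more event makes the counter 0, which
still differs from the saved value, so the check would still fire.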

> +static bool events_check_enabled;
> +static spinlock_t events_lock;
> +
> +void pm_wakeup_events_init(void)
> +{
> +	spin_lock_init(&events_lock);
> +}
> +
> +void pm_wakeup_event(struct device *dev)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&events_lock, flags);
> +	event_count++;
Should event_count be an atomic type so that we don't have to bother with
taking the events_lock?
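
Roughly what I have in mind, as a user space sketch with C11 atomics
rather than kernel code (in the kernel this would presumably be an
atomic_long_t with atomic_long_inc() and atomic_long_read()).  The
model_*() names are made up for illustration.  Note that this only
covers the global counter; the save-and-enable step in
pm_save_wakeup_count() updates several variables together and may
still want the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Global wakeup event counter, zero-initialized at file scope. */
static atomic_ulong event_count;

/* Lock-free counterpart of the increment in pm_wakeup_event(). */
static void model_wakeup_event(void)
{
	atomic_fetch_add(&event_count, 1);
}

/* Lock-free read of the current count. */
static unsigned long model_get_count(void)
{
	return atomic_load(&event_count);
}

/* True only if no event arrived since the caller read "count". */
static bool model_count_matches(unsigned long count)
{
	return atomic_load(&event_count) == count;
}
```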

> +	if (dev)
> +		dev->power.wakeup_count++;
> +	spin_unlock_irqrestore(&events_lock, flags);
> +}
> +
> +bool pm_check_wakeup_events(bool enable)
> +{
> +	unsigned long flags;
> +	bool ret;
> +
> +	spin_lock_irqsave(&events_lock, flags);
> +	ret = !events_check_enabled || (event_count == saved_event_count);
I'm not getting the events_check_enabled flag yet.

> +	events_check_enabled = enable;
I'm not sure this is the right thing, given all the different ways the
booleans interact with each other.

Which is a red flag to me.  This code is confusing.
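
To try to untangle it, here is a lock-free user space model of how I
read the three entry points in the patch.  The model_*() names are
mine; the bodies mirror pm_get_wakeup_count(), pm_save_wakeup_count()
and pm_check_wakeup_events() with the spinlock dropped, so treat it as
a sketch of my understanding rather than a statement of intent.

```c
#include <stdbool.h>

static unsigned long event_count;
static unsigned long saved_event_count;
static bool events_check_enabled;

/* Stand-in for pm_wakeup_event() without the per-device counter. */
static void model_event(void)
{
	event_count++;
}

/* Read of wakeup_count: returns the count and disarms the check. */
static unsigned long model_read(void)
{
	events_check_enabled = false;
	return event_count;
}

/* Write of wakeup_count: arms the check only if count is still current. */
static bool model_write(unsigned long count)
{
	if (count != event_count)
		return false;
	saved_event_count = count;
	events_check_enabled = true;
	return true;
}

/*
 * Suspend-time check: true means "carry on".  The result uses the old
 * flag value; "enable" decides whether later checks stay armed.
 */
static bool model_check(bool enable)
{
	bool ok = !events_check_enabled || event_count == saved_event_count;

	events_check_enabled = enable;
	return ok;
}
```

If this model is right, a disarmed check always passes, an armed check
passes only when no event arrived since the write, and the argument
just controls whether the check remains armed for the next call (true
in create_image(), false right before entering the sleep state).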


I'll look at it some more when I'm fresh tomorrow.

--mgross

> +	spin_unlock_irqrestore(&events_lock, flags);
> +	return ret;
> +}
> +
> +unsigned long pm_get_wakeup_count(void)
> +{
> +	unsigned long flags;
> +	unsigned long count;
> +
> +	spin_lock_irqsave(&events_lock, flags);
> +	events_check_enabled = false;
> +	count = event_count;
> +	spin_unlock_irqrestore(&events_lock, flags);
> +	return count;
> +}
> +
> +bool pm_save_wakeup_count(unsigned long count)
> +{
> +	unsigned long flags;
> +	bool ret = false;
> +
> +	spin_lock_irqsave(&events_lock, flags);
> +	if (count == event_count) {
> +		saved_event_count = count;
> +		events_check_enabled = true;
> +		ret = true;
> +	}
> +	spin_unlock_irqrestore(&events_lock, flags);
> +	return ret;
> +}
> +
> +unsigned long pm_dev_wakeup_count(struct device *dev)
> +{
> +	unsigned long flags;
> +	unsigned long count;
> +
> +	spin_lock_irqsave(&events_lock, flags);
> +	count = dev->power.wakeup_count;
> +	spin_unlock_irqrestore(&events_lock, flags);
> +	return count;
> +}
> Index: linux-2.6/include/linux/pm.h
> ===================================================================
> --- linux-2.6.orig/include/linux/pm.h
> +++ linux-2.6/include/linux/pm.h
> @@ -457,6 +457,7 @@ struct dev_pm_info {
>  #ifdef CONFIG_PM_SLEEP
>  	struct list_head	entry;
>  	struct completion	completion;
> +	unsigned long		wakeup_count;
>  #endif
>  #ifdef CONFIG_PM_RUNTIME
>  	struct timer_list	suspend_timer;
> @@ -552,6 +553,9 @@ extern void __suspend_report_result(cons
>  	} while (0)
>  
>  extern void device_pm_wait_for_dev(struct device *sub, struct device *dev);
> +
> +/* drivers/base/power/wakeup.c */
> +extern void pm_wakeup_event(struct device *dev);
>  #else /* !CONFIG_PM_SLEEP */
>  
>  #define device_pm_lock() do {} while (0)
> @@ -565,6 +569,8 @@ static inline int dpm_suspend_start(pm_m
>  #define suspend_report_result(fn, ret)		do {} while (0)
>  
>  static inline void device_pm_wait_for_dev(struct device *a, struct device *b) {}
> +
> +static inline void pm_wakeup_event(struct device *dev) {}
>  #endif /* !CONFIG_PM_SLEEP */
>  
>  /* How to reorder dpm_list after device_move() */
> Index: linux-2.6/drivers/base/power/Makefile
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/Makefile
> +++ linux-2.6/drivers/base/power/Makefile
> @@ -1,5 +1,5 @@
>  obj-$(CONFIG_PM)	+= sysfs.o
> -obj-$(CONFIG_PM_SLEEP)	+= main.o
> +obj-$(CONFIG_PM_SLEEP)	+= main.o wakeup.o
>  obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
>  obj-$(CONFIG_PM_OPS)	+= generic_ops.o
>  obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
> Index: linux-2.6/drivers/base/power/main.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -59,6 +59,7 @@ void device_pm_init(struct device *dev)
>  {
>  	dev->power.status = DPM_ON;
>  	init_completion(&dev->power.completion);
> +	dev->power.wakeup_count = 0;
>  	pm_runtime_init(dev);
>  }
>  
> Index: linux-2.6/kernel/power/power.h
> ===================================================================
> --- linux-2.6.orig/kernel/power/power.h
> +++ linux-2.6/kernel/power/power.h
> @@ -184,6 +184,12 @@ static inline void suspend_test_finish(c
>  #ifdef CONFIG_PM_SLEEP
>  /* kernel/power/main.c */
>  extern int pm_notifier_call_chain(unsigned long val);
> +
> +/* drivers/base/power/wakeup.c */
> +extern void pm_wakeup_events_init(void);
> +extern bool pm_check_wakeup_events(bool enable);
> +extern unsigned long pm_get_wakeup_count(void);
> +extern bool pm_save_wakeup_count(unsigned long count);
>  #endif
>  
>  #ifdef CONFIG_HIGHMEM
> Index: linux-2.6/kernel/power/suspend.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/suspend.c
> +++ linux-2.6/kernel/power/suspend.c
> @@ -157,7 +157,7 @@ static int suspend_enter(suspend_state_t
>  
>  	error = sysdev_suspend(PMSG_SUSPEND);
>  	if (!error) {
> -		if (!suspend_test(TEST_CORE))
> +		if (!suspend_test(TEST_CORE) && pm_check_wakeup_events(false))
>  			error = suspend_ops->enter(state);
>  		sysdev_resume();
>  	}
> Index: linux-2.6/kernel/power/hibernate.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/hibernate.c
> +++ linux-2.6/kernel/power/hibernate.c
> @@ -277,7 +277,7 @@ static int create_image(int platform_mod
>  		goto Enable_irqs;
>  	}
>  
> -	if (hibernation_test(TEST_CORE))
> +	if (hibernation_test(TEST_CORE) || !pm_check_wakeup_events(true))
>  		goto Power_up;
>  
>  	in_suspend = 1;
> @@ -511,14 +511,18 @@ int hibernation_platform_enter(void)
>  
>  	local_irq_disable();
>  	sysdev_suspend(PMSG_HIBERNATE);
> +	if (!pm_check_wakeup_events(false))
> +		goto Power_up;
> +
>  	hibernation_ops->enter();
>  	/* We should never get here */
>  	while (1);
>  
> -	/*
> -	 * We don't need to reenable the nonboot CPUs or resume consoles, since
> -	 * the system is going to be halted anyway.
> -	 */
> + Power_up:
> +	sysdev_resume();
> +	local_irq_enable();
> +	enable_nonboot_cpus();
> +
>   Platform_finish:
>  	hibernation_ops->finish();
>  
> Index: linux-2.6/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pci-acpi.c
> +++ linux-2.6/drivers/pci/pci-acpi.c
> @@ -48,6 +48,8 @@ static void pci_acpi_wake_dev(acpi_handl
>  	if (event == ACPI_NOTIFY_DEVICE_WAKE && pci_dev) {
>  		pci_check_pme_status(pci_dev);
>  		pm_runtime_resume(&pci_dev->dev);
> +		if (device_may_wakeup(&pci_dev->dev))
> +			pm_wakeup_event(&pci_dev->dev);
>  		if (pci_dev->subordinate)
>  			pci_pme_wakeup_bus(pci_dev->subordinate);
>  	}
> Index: linux-2.6/drivers/pci/pcie/pme/pcie_pme.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pcie/pme/pcie_pme.c
> +++ linux-2.6/drivers/pci/pcie/pme/pcie_pme.c
> @@ -147,6 +147,8 @@ static bool pcie_pme_walk_bus(struct pci
>  		/* Skip PCIe devices in case we started from a root port. */
>  		if (!pci_is_pcie(dev) && pci_check_pme_status(dev)) {
>  			pm_request_resume(&dev->dev);
> +			if (device_may_wakeup(&dev->dev))
> +				pm_wakeup_event(&dev->dev);
>  			ret = true;
>  		}
>  
> Index: linux-2.6/drivers/base/power/power.h
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/power.h
> +++ linux-2.6/drivers/base/power/power.h
> @@ -30,6 +30,9 @@ extern void device_pm_move_before(struct
>  extern void device_pm_move_after(struct device *, struct device *);
>  extern void device_pm_move_last(struct device *);
>  
> +/* drivers/base/power/wakeup.c */
> +extern unsigned long pm_dev_wakeup_count(struct device *dev);
> +
>  #else /* !CONFIG_PM_SLEEP */
>  
>  static inline void device_pm_init(struct device *dev)
> Index: linux-2.6/drivers/base/power/sysfs.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/sysfs.c
> +++ linux-2.6/drivers/base/power/sysfs.c
> @@ -144,6 +144,14 @@ wake_store(struct device * dev, struct d
>  
>  static DEVICE_ATTR(wakeup, 0644, wake_show, wake_store);
>  
> +static ssize_t wakeup_count_show(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%lu\n", pm_dev_wakeup_count(dev));
> +}
> +
> +static DEVICE_ATTR(wakeup_count, 0444, wakeup_count_show, NULL);
> +
>  #ifdef CONFIG_PM_ADVANCED_DEBUG
>  #ifdef CONFIG_PM_RUNTIME
>  
> @@ -230,6 +238,7 @@ static struct attribute * power_attrs[]
>  	&dev_attr_control.attr,
>  #endif
>  	&dev_attr_wakeup.attr,
> +	&dev_attr_wakeup_count.attr,
>  #ifdef CONFIG_PM_ADVANCED_DEBUG
>  	&dev_attr_async.attr,
>  #ifdef CONFIG_PM_RUNTIME
> 
_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

