On Wed, Jul 3, 2024 at 11:43 PM Isaac J. Manjarres <isaacmanjarres@xxxxxxxxxx> wrote: > > From: Manish Varma <varmam@xxxxxxxxxx> > > We'll often see aborted suspend operations that look like: > > PM: suspend entry 2024-07-03 15:55:15.372419634 UTC > PM: PM: Pending Wakeup Sources: [timerfd] > Abort: Pending Wakeup Sources: [timerfd] > PM: suspend exit 2024-07-03 15:55:15.445281857 UTC > > From this, it seems a timerfd caused the abort, but that can be > confusing, as timerfds don't create wakeup sources. However, > eventpoll can, and when it does, it names them after the underlying > file descriptor. Unfortunately, all the file descriptors are called > "[timerfd]", and a system may have many timerfds, so this isn't very > useful to debug what's going on to cause the suspend to abort. > > To improve this, change the way eventpoll wakeup sources are named: > > 1) The top-level per-process eventpoll wakeup source is now named > "epollN:P" (instead of just "eventpoll"), where N is a unique ID token, > and P is the PID of the creating process. > > 2) Individual eventpoll item wakeup sources are now named > "epollitemN:P.F", where N is a unique ID token, P is PID of the creating > process, and F is the name of the underlying file descriptor. > > Now, when the scenario described earlier is encountered, the following > kernel logs are emitted: > > PM: suspend entry 2024-07-03 15:39:24.945791824 UTC > PM: PM: Pending Wakeup Sources: epollitem30:6375.[timerfd] > Abort: Pending Wakeup Sources: epollitem30:6375.[timerfd] > PM: suspend exit 2024-07-03 15:39:25.017775019 UTC > > There are various benefits to this new naming convention: > > 1) It is clear that the wakeup source is linked to an eventpoll > item. > > 2) Now that the PID of the process associated with that timerfd > instance is known, it is easy to map the PID of the process to the > name of the process. With this information, it is easy to start > debugging which process is causing this issue to occur. > > 3) Even if process 6375 creates multiple timerfd instances, the > ID token is useful in identifying which timerfd instance associated > with the process is causing suspend to abort, as it is monotonically > increasing. So if the order in which the timerfd instances for the > process is known, then one can pinpoint which timerfd instance is > causing this issue. > > Co-developed-by: Kelly Rossmoyer <krossmo@xxxxxxxxxx> > Signed-off-by: Kelly Rossmoyer <krossmo@xxxxxxxxxx> > Signed-off-by: Manish Varma <varmam@xxxxxxxxxx> > Co-developed-by: Isaac J. Manjarres <isaacmanjarres@xxxxxxxxxx> > Signed-off-by: Isaac J. Manjarres <isaacmanjarres@xxxxxxxxxx> > --- > drivers/base/power/wakeup.c | 12 +++++++++--- For the changes in wakeup.c Acked-by: Rafael J. Wysocki <rafael@xxxxxxxxxx> > fs/eventpoll.c | 11 +++++++++-- > include/linux/pm_wakeup.h | 8 ++++---- > 3 files changed, 22 insertions(+), 9 deletions(-) > > v1 -> v2: > - Renamed instance count to wakesource_create_id to better describe > its purpose. > - Changed the wakeup source naming convention for wakeup sources > created by eventpoll to avoid changing the timerfd names. > - Used the PID of the process instead of the process name for the > sake of uniqueness when creating wakeup sources. > > v2 -> v3: > - Changed wakeup_source_register() to take in a format string > and arguments to avoid duplicating code to construct wakeup > source names. > - Moved the definition of wakesource_create_id so that it is > always defined to fix an compilation error. > > v3 -> v4: > - Changed the naming convention for the top-level epoll wakeup > sources to include an ID for uniqueness. This is needed in > cases where a process is using two epoll fds. > - Edited commit log to reflect new changes and add new tags. > > v4 -> v5: > - Added the format attribute to the wakeup_source_register() > function to address a warning from the kernel test robot: > https://lore.kernel.org/all/202406050504.UvdlPAQ0-lkp@xxxxxxxxx/ > > v5 -> v6: > - Reworded the commit text to clarify the scenarios in which this > patch is helpful, as per feedback from > John Stultz <jstultz@xxxxxxxxxx> > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c > index 752b417e8129..04a808607b62 100644 > --- a/drivers/base/power/wakeup.c > +++ b/drivers/base/power/wakeup.c > @@ -209,13 +209,19 @@ EXPORT_SYMBOL_GPL(wakeup_source_remove); > /** > * wakeup_source_register - Create wakeup source and add it to the list. > * @dev: Device this wakeup source is associated with (or NULL if virtual). > - * @name: Name of the wakeup source to register. > + * @fmt: format string for the wakeup source name > */ > -struct wakeup_source *wakeup_source_register(struct device *dev, > - const char *name) > +__printf(2, 3) struct wakeup_source *wakeup_source_register(struct device *dev, > + const char *fmt, ...) > { > struct wakeup_source *ws; > int ret; > + char name[128]; > + va_list args; > + > + va_start(args, fmt); > + vsnprintf(name, sizeof(name), fmt, args); > + va_end(args); > > ws = wakeup_source_create(name); > if (ws) { > diff --git a/fs/eventpoll.c b/fs/eventpoll.c > index f53ca4f7fced..941df15208a4 100644 > --- a/fs/eventpoll.c > +++ b/fs/eventpoll.c > @@ -338,6 +338,7 @@ static void __init epoll_sysctls_init(void) > #define epoll_sysctls_init() do { } while (0) > #endif /* CONFIG_SYSCTL */ > > +static atomic_t wakesource_create_id = ATOMIC_INIT(0); > static const struct file_operations eventpoll_fops; > > static inline int is_file_epoll(struct file *f) > @@ -1545,15 +1546,21 @@ static int ep_create_wakeup_source(struct epitem *epi) > { > struct name_snapshot n; > struct wakeup_source *ws; > + pid_t task_pid; > + int id; > + > + task_pid = task_pid_nr(current); > > if (!epi->ep->ws) { > - epi->ep->ws = wakeup_source_register(NULL, "eventpoll"); > + id = atomic_inc_return(&wakesource_create_id); > + epi->ep->ws = wakeup_source_register(NULL, "epoll:%d:%d", id, task_pid); > if (!epi->ep->ws) > return -ENOMEM; > } > > + id = atomic_inc_return(&wakesource_create_id); > take_dentry_name_snapshot(&n, epi->ffd.file->f_path.dentry); > - ws = wakeup_source_register(NULL, n.name.name); > + ws = wakeup_source_register(NULL, "epollitem%d:%d.%s", id, task_pid, n.name.name); > release_dentry_name_snapshot(&n); > > if (!ws) > diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h > index 76cd1f9f1365..1fb6dca981c2 100644 > --- a/include/linux/pm_wakeup.h > +++ b/include/linux/pm_wakeup.h > @@ -99,8 +99,8 @@ extern struct wakeup_source *wakeup_source_create(const char *name); > extern void wakeup_source_destroy(struct wakeup_source *ws); > extern void wakeup_source_add(struct wakeup_source *ws); > extern void wakeup_source_remove(struct wakeup_source *ws); > -extern struct wakeup_source *wakeup_source_register(struct device *dev, > - const char *name); > +extern __printf(2, 3) struct wakeup_source *wakeup_source_register(struct device *dev, > + const char *fmt, ...); > extern void wakeup_source_unregister(struct wakeup_source *ws); > extern int wakeup_sources_read_lock(void); > extern void wakeup_sources_read_unlock(int idx); > @@ -140,8 +140,8 @@ static inline void wakeup_source_add(struct wakeup_source *ws) {} > > static inline void wakeup_source_remove(struct wakeup_source *ws) {} > > -static inline struct wakeup_source *wakeup_source_register(struct device *dev, > - const char *name) > +static inline __printf(2, 3) struct wakeup_source *wakeup_source_register(struct device *dev, > + const char *fmt, ...) > { > return NULL; > } > -- > 2.45.2.803.g4e1b14247a-goog >