On 12/9/20 1:51 PM, Georgi Djakov wrote: > From: Liam Mark <lmark@xxxxxxxxxxxxxx> > > Collect the time for each allocation recorded in page owner so that > allocation "surges" can be measured. > > Record the pid for each allocation recorded in page owner so that > the source of allocation "surges" can be better identified. > > The above is very useful when doing memory analysis. On a crash for > example, we can get this information from kdump (or ramdump) and parse > it to figure out memory allocation problems. Yes, I can imagine this to be useful. > Please note that on x86_64 this increases the size of struct page_owner > from 16 bytes to 32. That's the tradeoff, but it's not a functionality intended for production, so unless somebody says they need to enable page_owner for debugging and this increase prevents them from fitting into available memory, let's not complicate things with making this optional. > Signed-off-by: Liam Mark <lmark@xxxxxxxxxxxxxx> > Signed-off-by: Georgi Djakov <georgi.djakov@xxxxxxxxxx> Acked-by: Vlastimil Babka <vbabka@xxxxxxx> > --- > > v2: > - Improve the commit message (Andrew and Vlastimil) > - Update page_owner.rst with more recent object size information (Andrew) > - Use pid_t for the pid (Andrew) > - Print the info also in __dump_page_owner() (Vlastimil) > > v1: https://lore.kernel.org/r/20201112184106.733-1-georgi.djakov@xxxxxxxxxx/ > > > Documentation/vm/page_owner.rst | 12 ++++++------ > mm/page_owner.c | 17 +++++++++++++---- > 2 files changed, 19 insertions(+), 10 deletions(-) > > diff --git a/Documentation/vm/page_owner.rst b/Documentation/vm/page_owner.rst > index 02deac76673f..cf7c0c361621 100644 > --- a/Documentation/vm/page_owner.rst > +++ b/Documentation/vm/page_owner.rst > @@ -41,17 +41,17 @@ size change due to this facility. > - Without page owner:: > > text data bss dec hex filename > - 40662 1493 644 42799 a72f mm/page_alloc.o > + 48392 2333 644 51369 c8a9 mm/page_alloc.o > > - With page owner:: > > text data bss dec hex filename > - 40892 1493 644 43029 a815 mm/page_alloc.o > - 1427 24 8 1459 5b3 mm/page_ext.o > - 2722 50 0 2772 ad4 mm/page_owner.o > + 48800 2445 644 51889 cab1 mm/page_alloc.o > + 6574 108 29 6711 1a37 mm/page_owner.o > + 1025 8 8 1041 411 mm/page_ext.o > > -Although, roughly, 4 KB code is added in total, page_alloc.o increase by > -230 bytes and only half of it is in hotpath. Building the kernel with > +Although, roughly, 8 KB code is added in total, page_alloc.o increase by > +520 bytes and less than half of it is in hotpath. Building the kernel with > page owner and turning it on if needed would be great option to debug > kernel memory problem. > > diff --git a/mm/page_owner.c b/mm/page_owner.c > index b735a8eafcdb..af464bb7fbe7 100644 > --- a/mm/page_owner.c > +++ b/mm/page_owner.c > @@ -10,6 +10,7 @@ > #include <linux/migrate.h> > #include <linux/stackdepot.h> > #include <linux/seq_file.h> > +#include <linux/sched/clock.h> > > #include "internal.h" > > @@ -25,6 +26,8 @@ struct page_owner { > gfp_t gfp_mask; > depot_stack_handle_t handle; > depot_stack_handle_t free_handle; > + u64 ts_nsec; > + pid_t pid; > }; > > static bool page_owner_enabled = false; > @@ -172,6 +175,8 @@ static inline void __set_page_owner_handle(struct page *page, > page_owner->order = order; > page_owner->gfp_mask = gfp_mask; > page_owner->last_migrate_reason = -1; > + page_owner->pid = current->pid; > + page_owner->ts_nsec = local_clock(); > __set_bit(PAGE_EXT_OWNER, &page_ext->flags); > __set_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags); > > @@ -236,6 +241,8 @@ void __copy_page_owner(struct page *oldpage, struct page *newpage) > new_page_owner->last_migrate_reason = > old_page_owner->last_migrate_reason; > new_page_owner->handle = old_page_owner->handle; > + new_page_owner->pid = old_page_owner->pid; > + new_page_owner->ts_nsec = old_page_owner->ts_nsec; > > /* > * We don't clear the bit on the oldpage as it's going to be freed > @@ -349,9 +356,10 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn, > return -ENOMEM; > > ret = snprintf(kbuf, count, > - "Page allocated via order %u, mask %#x(%pGg)\n", > + "Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns\n", > page_owner->order, page_owner->gfp_mask, > - &page_owner->gfp_mask); > + &page_owner->gfp_mask, page_owner->pid, > + page_owner->ts_nsec); > > if (ret >= count) > goto err; > @@ -427,8 +435,9 @@ void __dump_page_owner(struct page *page) > else > pr_alert("page_owner tracks the page as freed\n"); > > - pr_alert("page last allocated via order %u, migratetype %s, gfp_mask %#x(%pGg)\n", > - page_owner->order, migratetype_names[mt], gfp_mask, &gfp_mask); > + pr_alert("page last allocated via order %u, migratetype %s, gfp_mask %#x(%pGg), pid %d, ts %llu\n", > + page_owner->order, migratetype_names[mt], gfp_mask, &gfp_mask, > + page_owner->pid, page_owner->ts_nsec); > > handle = READ_ONCE(page_owner->handle); > if (!handle) { >