HEAD~1 split struct _ddebugs into heads & bodies, linked accross 2 ELF sections. Lets now store copies of the bodies into a zs_pool, and relink head to the new body. This should allow recycling the section soon. The strategy is to let a compression algo handle the repetition, and map individual records in when needed. If that works, we only need one record at a time, so the 3:1 zram compression applys to 99.9% of callsites, ie to 98120 bytes below: dyndbg: 259 modules, 2453 entries and 10360 bytes in ddebug tables, 98120 bytes in __dyndbg section, 98120 bytes in __dyndbg_callsites section dyndbg: 2453 entries. repeated entries: 2193 module 2058 file 981 func Since dynamic_debug is nice to have but rarely used, paying more for active logging to save in-kernel memory might be a worthwhile tradeoff. This is how: ddebug_zpool_init(), at late_initcall time, creates the zpool. Existing ddebug_add_module() now also calls ddebug_zpool_add(), which copies the module's _ddebug records into the zpool, once it is ready. For builtins, added early, the pool is not ready yet. So ddebug_zpool_init() also calls ddebug_zpool_add() for all those builtin modules already _add()ed. ddebug_zpool_add() does, foreach _ddebug: - zs_mallocs and saves to .zhandle, - zs_maps and saves .site to it - zs_unmaps it, triggering write to zram - .site=0, detaching __dyndbg_callsites[item] which later triggers zs_map So we leave late-init with a full zram copy, and __dyndbg_callsites ready to reclaim. run-time access for 3 existing users is entirely via helpers; ddebug_callsite_get(), ddebug_callsite_put(). These 3 uses are all with ddebug_lock held. _get() returns .site 1st, or maps zrec to it and returns it. _put() always unmaps, minimizing online ram (see 1 below) ddebug_change() also gets s/continue/goto skipsite/g to cleanly unmap the record at the bottom of the loop. Next steps: 1. dont always unmap in _put(), leave enabled pr_debugs mapped. This would make the set of zs_mapped recs work like a cache, reducing zs_mappings to a trickle, and only for <>control. 2. adapt struct _ddebug to unionize .site and .zhandle We currently use !!.site and !!.zhandle tests to keep state, we have bits available in flags to replace these tests, making union possible. A crazy alt is to create a "restricted" zpool, which limits pool size to 2**N handles, and returned handles to an (N+ck)-bit number, which would fit beside our 8-bit flags, inside an 8-byte pointer. This would be cool, but isnt necessary, since union looks practical. 3. move key from struct _ddebug into _ddebug_callsite key doesnt exactly fit semantically (and its not RO), but its use is super rare, and is never needed without also needing all the other callsite data. Getting it out of head saves on-line ram. But it broke in HEAD~1 when I tried it, with asm goto errors. 4. linker question DECLARE_DYNAMIC_DEBUG_METADATA macro fills 2 sections simultaneously. If we can rely on the linker to keep those records in identical order, we can 'index' both _dyndbg & _callsites (& keys) with [N-opaque] and drop the .site pointer. 5. improve compression ? Im seeing 3:1 pages_per_zspage: # cut -c1-12,40-77,85-90 $SKD/zsmalloc/dyndbg_callsites/classes | head -n3 class sizebj_allocated obj_used pages_used pagzspage 0 32 0 0 0 1 1 48 2816 2605 33 3 Since the __dyndbg_callsite section is well sorted, a simple run-length encoding on 3 RO members (modname, file, function) could reclaim most of the (90%, 84%, 45%) repeated values. That said, a standard compressor should be able to compete, and there are several to try. Lastly, this whole zram block is RO after late-init (maybe, if we split into 2 zpools, for builtins and loadable modules). So compression can be biased towards costly but excellent compression (esp for RO zpool) and good small block decompression for random-ish access. 6. we could add a ram_reclaim(), to walk the dt-list, check each _ddebug, and unmap the mapped ones. This presumes 1. is done, and there are sometimes memory pressures to respond to. Potential benefits: - convert in-kernel mem to zram/etc - zram compression - eventually swap it out entirely - map in the enabled callsites only Signed-off-by: Jim Cromie <jim.cromie@xxxxxxxxx> --- include/linux/dynamic_debug.h | 1 + lib/dynamic_debug.c | 140 +++++++++++++++++++++++++++++++--- 2 files changed, 129 insertions(+), 12 deletions(-) diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index 1bf52952df69..693911920a32 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -29,6 +29,7 @@ struct _ddebug_callsite { struct _ddebug { struct _ddebug_callsite *site; + long unsigned int zhandle; /* * The flags field controls the behaviour at the callsite. diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c index 9161f345e3c9..049299027fb3 100644 --- a/lib/dynamic_debug.c +++ b/lib/dynamic_debug.c @@ -36,6 +36,7 @@ #include <linux/sched.h> #include <linux/device.h> #include <linux/netdevice.h> +#include <linux/zsmalloc.h> #include <rdma/ib_verbs.h> @@ -72,6 +73,8 @@ static LIST_HEAD(ddebug_tables); static int verbose; module_param(verbose, int, 0644); +static struct zs_pool *dd_callsite_zpool; + /* Return the path relative to source root */ static inline const char *trim_prefix(const char *path) { @@ -118,6 +121,8 @@ do { \ #define vpr_info(fmt, ...) vnpr_info(1, fmt, ##__VA_ARGS__) #define v2pr_info(fmt, ...) vnpr_info(2, fmt, ##__VA_ARGS__) +#define v3pr_info(fmt, ...) vnpr_info(3, fmt, ##__VA_ARGS__) +#define v4pr_info(fmt, ...) vnpr_info(4, fmt, ##__VA_ARGS__) static void vpr_info_dq(const struct ddebug_query *query, const char *msg) { @@ -139,6 +144,41 @@ static void vpr_info_dq(const struct ddebug_query *query, const char *msg) query->first_lineno, query->last_lineno); } +/* + * Lifecycle of struct _ddebug's .site and .zhandle: + * + * At early-init, .site points into __dyndbg_callsite[], .zhandle=0 + * + * At late-init, zpool_add(callsite)s copies into zram, saves ref in + * .zhandle, and forgets .site, allowing __dyndbg_callsite[] to be + * reclaimed. After this, we restore .site from .zhandle when needed. + * We default to minimal mapping, load on demand. + * + * _get() returns .site preferentially, and updates it when needed. + * this works for early-prints and minimal mapped records strategy. + */ +static struct _ddebug_callsite* ddebug_callsite_get(struct _ddebug *dp) +{ + if (dp->site) + return dp->site; + + dp->site = (struct _ddebug_callsite*) + zs_map_object(dd_callsite_zpool, dp->zhandle, ZS_MM_RO); + + return dp->site; +} + +static void ddebug_callsite_put(struct _ddebug *dp) +{ + if (!dp->site || !dp->zhandle) + /* no site to unmap, or no means to restore */ + return; + + /* always unmap for now. if !pr-debug was too hard */ + zs_unmap_object(dd_callsite_zpool, dp->zhandle); + dp->site = NULL; +} + /* * Search the tables for _ddebug's which match the given `query' and * apply the `flags' and `mask' to them. Returns number of matching @@ -165,7 +205,8 @@ static int ddebug_change(const struct ddebug_query *query, for (i = 0; i < dt->num_ddebugs; i++) { struct _ddebug *dp = &dt->ddebugs[i]; - struct _ddebug_callsite *dc = dp->site; + struct _ddebug_callsite *dc; + dc = ddebug_callsite_get(dp); /* match against the source filename */ if (query->filename && @@ -174,12 +215,12 @@ static int ddebug_change(const struct ddebug_query *query, kbasename(dc->filename)) && !match_wildcard(query->filename, trim_prefix(dc->filename))) - continue; + goto skipsite; /* match against the function */ if (query->function && !match_wildcard(query->function, dc->function)) - continue; + goto skipsite; /* match against the format */ if (query->format) { @@ -188,24 +229,24 @@ static int ddebug_change(const struct ddebug_query *query, /* anchored search. match must be at beginning */ p = strstr(dc->format, query->format+1); if (p != dc->format) - continue; + goto skipsite; } else if (!strstr(dc->format, query->format)) - continue; + goto skipsite; } /* match against the line number range */ if (query->first_lineno && dc->lineno < query->first_lineno) - continue; + goto skipsite; if (query->last_lineno && dc->lineno > query->last_lineno) - continue; + goto skipsite; nfound++; newflags = (dp->flags & modifiers->mask) | modifiers->flags; if (newflags == dp->flags) - continue; + goto skipsite; #ifdef CONFIG_JUMP_LABEL if (dp->flags & _DPRINTK_FLAGS_PRINT) { if (!(modifiers->flags & _DPRINTK_FLAGS_PRINT)) @@ -218,6 +259,9 @@ static int ddebug_change(const struct ddebug_query *query, trim_prefix(dc->filename), dc->lineno, dt->mod_name, dc->function, ddebug_describe_flags(dp->flags, &fbuf)); + + skipsite: + ddebug_callsite_put(dp); } } mutex_unlock(&ddebug_lock); @@ -569,14 +613,16 @@ static int remaining(int wrote) return 0; } -static char *dynamic_emit_prefix(const struct _ddebug *dp, char *buf) +static char *dynamic_emit_prefix(struct _ddebug *dp, char *buf) { int pos_after_tid; int pos = 0; - const struct _ddebug_callsite *desc = dp->site; + const struct _ddebug_callsite *desc; *buf = '\0'; + desc = ddebug_callsite_get(dp); + if (dp->flags & _DPRINTK_FLAGS_INCL_TID) { if (in_interrupt()) pos += snprintf(buf + pos, remaining(pos), "<intr> "); @@ -599,6 +645,8 @@ static char *dynamic_emit_prefix(const struct _ddebug *dp, char *buf) if (pos >= PREFIX_SIZE) buf[PREFIX_SIZE - 1] = '\0'; + ddebug_callsite_put(dp); + return buf; } @@ -863,7 +911,7 @@ static int ddebug_proc_show(struct seq_file *m, void *p) { struct ddebug_iter *iter = m->private; struct _ddebug *dp = p; - struct _ddebug_callsite *dc = dp->site; + struct _ddebug_callsite *dc; struct flagsbuf flags; if (p == SEQ_START_TOKEN) { @@ -872,6 +920,8 @@ static int ddebug_proc_show(struct seq_file *m, void *p) return 0; } + dc = ddebug_callsite_get(dp); + seq_printf(m, "%s:%u [%s]%s =%s \"", trim_prefix(dc->filename), dc->lineno, iter->table->mod_name, dc->function, @@ -879,6 +929,8 @@ static int ddebug_proc_show(struct seq_file *m, void *p) seq_escape(m, dc->format, "\t\r\n\""); seq_puts(m, "\"\n"); + ddebug_callsite_put(dp); + return 0; } @@ -922,6 +974,39 @@ static const struct proc_ops proc_fops = { .proc_write = ddebug_proc_write }; +/* + * copy struct _ddebug records into the zpool, and remember zhandle in + * the struct. This is close to what we'll want to unionize the + * handle and the struct + */ +static void ddebug_zpool_add(struct _ddebug *dp) +{ + unsigned long handle; + struct _ddebug_callsite *cursor; + + handle = zs_malloc(dd_callsite_zpool, sizeof(struct _ddebug_callsite), + GFP_KERNEL); + if (!handle) { + pr_err("pool malloc failed on %s\n", dp->site->function); + return; + } + BUG_ON(handle % 2); /* zs-malloc handles are even ! */ + dp->zhandle = handle; + + cursor = (struct _ddebug_callsite *) + zs_map_object(dd_callsite_zpool, handle, ZS_MM_WO); + + if (!cursor) { + pr_err("zs-map failed\n"); + return; + } + memcpy(cursor, dp->site, sizeof(struct _ddebug_callsite)); + + /* forget __dyndbg_callsites[] so we can reclaim it */ + dp->site = NULL; + zs_unmap_object(dd_callsite_zpool, handle); +} + /* * Allocate a new ddebug_table for the given module * and add it to the global list. @@ -947,10 +1032,15 @@ int ddebug_add_module(struct _ddebug *tab, unsigned int n, dt->ddebugs = tab; mutex_lock(&ddebug_lock); + if (dd_callsite_zpool) + /* zpool is ready for filling late */ + for (; n; n--, tab++) + ddebug_zpool_add(tab); + list_add(&dt->link, &ddebug_tables); mutex_unlock(&ddebug_lock); - v2pr_info("%3u debug prints in module %s\n", n, dt->mod_name); + v2pr_info("%3u debug prints in module %s\n", dt->num_ddebugs, dt->mod_name); return 0; } @@ -1060,6 +1150,31 @@ static int __init dynamic_debug_init_control(void) return 0; } +/* + * initialize the zpool, and fill it with copies of struct _ddebug + * records in the __verbose/__dyndbg section. + */ +static void __init ddebug_zpool_init(void) +{ + struct _ddebug *iter; + + /* tbd- no corresponding destroy */ + dd_callsite_zpool = zs_create_pool("dyndbg_callsites"); + if (!dd_callsite_zpool) { + pr_err("create pool failed\n"); + return; + } + + /* add-module normally does this, but not in time for builtins */ + for (iter = __start___dyndbg; iter < __stop___dyndbg; iter++) + ddebug_zpool_add(iter); + + v2pr_info("total pages: %lu compaction: %lu\n", + zs_get_total_pages(dd_callsite_zpool), + zs_compact(dd_callsite_zpool)); +} +late_initcall(ddebug_zpool_init); + static int __init dynamic_debug_init(void) { struct _ddebug *iter, *iter_start, *prev = 0; @@ -1149,3 +1264,4 @@ early_initcall(dynamic_debug_init); /* Debugfs setup must be done later */ fs_initcall(dynamic_debug_init_control); + -- 2.26.2 _______________________________________________ Kernelnewbies mailing list Kernelnewbies@xxxxxxxxxxxxxxxxx https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies