On 4/22/20 10:46 PM, Roman Gushchin wrote:
> To implement per-object slab memory accounting, we need to
> convert slab vmstat counters to bytes. Actually, out of
> 4 levels of counters: global, per-node, per-memcg and per-lruvec
> only two last levels will require byte-sized counters.
> It's because global and per-node counters will be counting the
> number of slab pages, and per-memcg and per-lruvec will be
> counting the amount of memory taken by charged slab objects.
>
> Converting all vmstat counters to bytes or even all slab
> counters to bytes would introduce an additional overhead.
> So instead let's store global and per-node counters
> in pages, and memcg and lruvec counters in bytes.
>
> To make the API clean all access helpers (both on the read
> and write sides) are dealing with bytes.
>
> To avoid back-and-forth conversions a new flavor of helpers
> is introduced, which always returns values in pages:
> node_page_state_pages() and global_node_page_state_pages().
>
> Actually new helpers are just reading raw values. Old helpers are
> simple wrappers, which perform a conversion if the vmstat items are
> in bytes. Because at the moment no one actually need bytes,
> there are WARN_ON_ONCE() macroses inside to warn about inappropriate
> use cases.
>
> Thanks to Johannes Weiner for the idea of having the byte-sized API
> on top of the page-sized internal storage.
>
> Signed-off-by: Roman Gushchin <guro@xxxxxx>

Reviewed-By: Vlastimil Babka <vbabka@xxxxxxx>

But it's somewhat complicated, so it would be great to document it in
comments of e.g. include/linux/vmstat.h that what the API returns as
unsigned long, can be either bytes or pages depending on
vmstat_item_in_bytes().

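A minimal user-space sketch of the layering described in the quoted
changelog, showing the kind of comments that could spell out the units.
It is only one reading of the description, not the patch's code: every
model_* name, the MODEL_NR_SLAB_B item and the 4 KiB page size are
assumptions made for illustration, and the real helpers additionally
take a pgdat argument and warn via WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this model only: 4 KiB pages. */
#define MODEL_PAGE_SIZE 4096L

/* Hypothetical items; the real ones live in enum node_stat_item. */
enum model_stat_item {
	MODEL_NR_FILE_PAGES,	/* API unit: pages */
	MODEL_NR_SLAB_B,	/* API unit: bytes */
	MODEL_NR_ITEMS
};

/* Node-level storage is always kept in pages. */
static long model_node_stat[MODEL_NR_ITEMS];

/* Answers "is the API unit bytes?", not "is the storage in bytes?". */
bool model_item_in_bytes(enum model_stat_item item)
{
	return item == MODEL_NR_SLAB_B;
}

/* Write side: callers pass the API unit; byte deltas are stored as pages. */
void model_mod_node_state(enum model_stat_item item, long delta)
{
	if (model_item_in_bytes(item)) {
		/* This model only accepts whole-page byte deltas. */
		assert(delta % MODEL_PAGE_SIZE == 0);
		delta /= MODEL_PAGE_SIZE;
	}
	model_node_stat[item] += delta;
}

/* New-style reader: raw value, always in pages. */
unsigned long model_node_page_state_pages(enum model_stat_item item)
{
	long pages = model_node_stat[item];

	return pages < 0 ? 0 : (unsigned long)pages;
}

/* Old-style reader: converts to bytes when the item's API unit is bytes. */
unsigned long model_node_page_state(enum model_stat_item item)
{
	unsigned long pages = model_node_page_state_pages(item);

	return model_item_in_bytes(item) ? pages * MODEL_PAGE_SIZE : pages;
}
```

In this model the same unsigned long return value means bytes for
MODEL_NR_SLAB_B and pages for MODEL_NR_FILE_PAGES, which is the
ambiguity a comment next to the real helpers would need to resolve.
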
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -204,6 +204,11 @@ enum node_stat_item {
>  	NR_VM_NODE_STAT_ITEMS
>  };
>
> +static __always_inline bool vmstat_item_in_bytes(enum node_stat_item item)

This should also have a comment explaining if it's talking about API or
storage, as it's not immediately obvious.

> +{
> +	return false;
> +}
> +
>  /*
>   * We do arithmetic on the LRU lists in various places in the code,
>   * so it is important to keep the active lists LRU_ACTIVE higher in