Re: [PATCH v4 2/3] mm: add new KSM process and sysfs knobs

Stefan Roesch <shr@xxxxxxxxxxxx> · Wed, 05 Apr 2023 14:20:27 -0700

David Hildenbrand <david@xxxxxxxxxx> writes:

> On 10.03.23 19:28, Stefan Roesch wrote:
>> This adds the general_profit KSM sysfs knob and the process profit metric
>> and process merge type knobs to ksm_stat.
>> 1) split off pages_volatile function
>>     This splits off the pages_volatile function.  The next patch will
>>     use this function.
>> 2) expose general_profit metric
>>     The documentation mentions a general profit metric, however this
>>     metric is not calculated.  In addition the formula depends on the size
>>     of internal structures, which makes it more difficult for an
>>     administrator to make the calculation.  Adding the metric for a better
>>     user experience.
>> 3) document general_profit sysfs knob
>> 4) calculate ksm process profit metric
>>     The ksm documentation mentions the process profit metric and how to
>>     calculate it.  This adds the calculation of the metric.
>> 5) add ksm_merge_type() function
>>     This adds the ksm_merge_type function.  The function returns the
>>     merge type for the process.  For madvise it returns "madvise", for
>>     prctl it returns "process" and otherwise it returns "none".
>> 6) mm: expose ksm process profit metric and merge type in ksm_stat
>>     This exposes the ksm process profit metric in /proc/<pid>/ksm_stat.
>>     The name of the value is ksm_merge_type.  The documentation mentions
>>     the formula for the ksm process profit metric, however it does not
>>     calculate it.  In addition the formula depends on the size of internal
>>     structures.  So it makes sense to expose it.
>> 7) document new procfs ksm knobs
>>
>
> Often, when you have to start making a list of things that a patch does, it
> might make sense to split some of the items into separate patches such that you
> can avoid lists and just explain in list-free text how the pieces in the patch
> fit together.
>
> I'd suggest splitting this patch into logical pieces. For example, separating
> the general profit calculation/exposure from the per-mm profit and the per-mm
> ksm type indication.
>

Originally these were individual patches. If I recall correctly Johannes
Weiner wanted them as one patch. I can certainly split them again.

>> Link: https://lkml.kernel.org/r/20230224044000.3084046-3-shr@xxxxxxxxxxxx
>> Signed-off-by: Stefan Roesch <shr@xxxxxxxxxxxx>
>> Reviewed-by: Bagas Sanjaya <bagasdotme@xxxxxxxxx>
>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>> Cc: Rik van Riel <riel@xxxxxxxxxxx>
>> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> ---
>
>
> [...]
>
>>   KSM_ATTR_RO(pages_volatile);
>>   @@ -3280,6 +3305,21 @@ static ssize_t zero_pages_sharing_show(struct kobject
>> *kobj,
>>   }
>>   KSM_ATTR_RO(zero_pages_sharing);
>>   +static ssize_t general_profit_show(struct kobject *kobj,
>> +				   struct kobj_attribute *attr, char *buf)
>> +{
>> +	long general_profit;
>> +	long all_rmap_items;
>> +
>> +	all_rmap_items = ksm_max_page_sharing + ksm_pages_shared +
>> +				ksm_pages_unshared + pages_volatile();
>
> Are you sure you want to count a config knob (ksm_max_page_sharing) into that
> formula? I yet have to digest what this calculation implies, but it does feel
> odd.
>

This was a mistake. I wanted ksm_pages_sharing instead of
ksm_max_page_sharing.

>
> Further, maybe just avoid pages_volatile(). Expanding the formula (excluding
> ksm_max_page_sharing for now):
>
>
> all_rmap = ksm_pages_shared + ksm_pages_unshared + pages_volatile();
>
> -> expand pages_volatile() (ignoring the < 0 case)
>
> all_rmap = ksm_pages_shared + ksm_pages_unshared + ksm_rmap_items -
> ksm_pages_shared - ksm_pages_sharing - ksm_pages_unshared;
>
> -> simplify
>
> all_rmap = ksm_rmap_items + ksm_pages_sharing;
>
I'll simplify it.

> Or is the < 0 case relevant here?
>

A negative profit is ok.

>> +	general_profit = ksm_pages_sharing * PAGE_SIZE -
>> +				all_rmap_items * sizeof(struct ksm_rmap_item);
>> +
>> +	return sysfs_emit(buf, "%ld\n", general_profit);
>> +}
>> +KSM_ATTR_RO(general_profit);
>> +
>>   static ssize_t stable_node_dups_show(struct kobject *kobj,
>>   				     struct kobj_attribute *attr, char *buf)
>>   {
>> @@ -3345,6 +3385,7 @@ static struct attribute *ksm_attrs[] = {
>>   	&stable_node_dups_attr.attr,
>>   	&stable_node_chains_prune_millisecs_attr.attr,
>>   	&use_zero_pages_attr.attr,
>> +	&general_profit_attr.attr,
>>   	NULL,
>>   };
>>
>
> The calculations (profit) don't include when KSM places the shared zeropage I
> guess. Accounting that per MM (and eventually globally) is in the works. [1]
>
>
> [1]
> https://lore.kernel.org/lkml/20230328153852.26c2577e4bd921c371c47a7e@xxxxxxxxxxxxxxxxxxxx/t/