Re: [PATCH v1 0/4] mm/ksm: Add ksm advisor

David Hildenbrand <david@xxxxxxxxxx> writes:

> On 06.10.23 18:17, Stefan Roesch wrote:
>> David Hildenbrand <david@xxxxxxxxxx> writes:
>>
>>> On 04.10.23 21:02, Stefan Roesch wrote:
>>>> What is the KSM advisor?
>>>> =========================
>>>> The ksm advisor automatically manages the pages_to_scan setting to
>>>> achieve a target scan time. The target scan time defines how many seconds
>>>> it should take to scan all the candidate KSM pages. In other words the
>>>> pages_to_scan rate is changed by the advisor to achieve the target scan
>>>> time.
>>>>
>>>> Why do we need a KSM advisor?
>>>> ==============================
>>>> The number of candidate pages for KSM is dynamic. It can often be observed
>>>> that during the startup of an application more candidate pages need to be
>>>> processed. Without an advisor the pages_to_scan parameter needs to be
>>>> sized for the maximum number of candidate pages. With the scan time
>>>> advisor the pages_to_scan parameter can be changed based on demand.
>>>>
>>>> Algorithm
>>>> ==========
>>>> The algorithm calculates the change value based on the target scan time
>>>> and the previous scan time. To avoid perturbations an exponentially
>>>> weighted moving average is applied.
>>>> The algorithm has a max and min value to:
>>>> - guarantee responsiveness to changes
>>>> - avoid spending too much CPU
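
As an illustration of the algorithm quoted above, here is a minimal sketch in Python (not the kernel C code from the patch). It assumes the change factor is simply the ratio of the previous scan time to the target scan time. The function names, the 0.25 smoothing weight and the sample values are made up for the example.

```python
def ewma(prev, new, weight=0.25):
    """Exponentially weighted moving average; `weight` applies to the new sample."""
    return weight * new + (1.0 - weight) * prev


def next_pages_to_scan(pages_to_scan, prev_scan_time, target_scan_time,
                       min_pages, max_pages, weight=0.25):
    """Propose the next pages_to_scan value (hypothetical helper)."""
    if target_scan_time <= 0:
        raise ValueError("target_scan_time must be positive")

    # Assumption: a scan that took twice as long as the target needs
    # roughly twice as many pages per batch to hit the target next time.
    ratio = prev_scan_time / target_scan_time
    proposed = pages_to_scan * ratio

    # Smooth the proposal against the current value to damp perturbations.
    smoothed = ewma(pages_to_scan, proposed, weight)

    # Clamp to the configured bounds: the minimum keeps the scanner
    # responsive, the maximum limits CPU consumption.
    return round(min(max_pages, max(min_pages, smoothed)))


# Example: the last scan took 400s against a 200s target.
print(next_pages_to_scan(100, 400, 200, min_pages=50, max_pages=30000))
```

With these made-up inputs the proposal doubles to 200 pages, but the moving average only moves a quarter of the way there, so the value rises gradually over several scans instead of jumping.
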
>>>>
>>>> Parameters to influence the KSM scan advisor
>>>> =============================================
>>>> The respective parameters are:
>>>> - ksm_advisor_mode
>>>>     0: None (default), 1: scan time advisor
>>>> - ksm_advisor_target_scan_time
>>>>     how many seconds a scan of all candidate pages should take
>>>> - ksm_advisor_min_pages
>>>>     minimum value for pages_to_scan per batch
>>>> - ksm_advisor_max_pages
>>>>     maximum value for pages_to_scan per batch
>>>> The parameters are exposed as knobs in /sys/kernel/mm/ksm.
>>>> By default the scan time advisor is disabled.
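
A small sketch of how an administrator might set these knobs from a script. It assumes the sysfs file names match the parameter names listed above, which may not be what the patch actually uses. The helper name and the values are arbitrary examples, and writing to /sys/kernel/mm/ksm requires root.

```python
from pathlib import Path

# Directory named in the cover letter; the file names below are assumptions.
KSM_SYSFS = Path("/sys/kernel/mm/ksm")


def configure_advisor(settings, base=KSM_SYSFS):
    """Write each knob value to a file of the same name under `base`."""
    for name, value in settings.items():
        (Path(base) / name).write_text(f"{value}\n")


# Hypothetical usage (as root), enabling the scan time advisor:
#
# configure_advisor({
#     "ksm_advisor_mode": 1,                # 1: scan time advisor
#     "ksm_advisor_target_scan_time": 200,  # seconds, example value
#     "ksm_advisor_min_pages": 500,         # example value
#     "ksm_advisor_max_pages": 30000,       # example value
# })
```

The `base` argument only exists so the helper can be pointed at a scratch directory for a dry run.
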
>>>
>>> What would be the main reason to not have this enabled as default?
>>>
>> There might already be existing users who directly set pages_to_scan
>> and tuned the KSM settings accordingly, as the default setting of 100 for
>> pages_to_scan is too low for typical workloads.
>
> Good point.
>
>>
>>> IIUC, it is kind-of an auto-tuning of pages_to_scan. Would "auto-tuning"
>>> describe it better than "advisor" ?
>>>
>>> [...]
>>>
>> I'm fine with auto-tune. I was also thinking about that name, but I
>> chose advisor; it's a bit less strong and it needs input from the user.
>>
>
> I'm not a native speaker, but "adviser" to me implies that no action is taken,
> only advice is given :) But again, no native speaker.
>
>>>> How is defining a target scan time better?
>>>> ===========================================
>>>> For an administrator it is more logical to set a target scan time. The
>>>> administrator can determine how many pages are scanned on each scan.
>>>> Therefore setting a target scan time makes more sense.
>>>> In addition the administrator might have a good idea about the
>>>> memory sizing of their respective workloads.
>>>
>>> Is there any way you could imagine where we could have this just do something
>>> reasonable without any user input? IOW, true auto-tuning?
>>>
>> True auto-tuning might be difficult as users might want to be able to
>> choose how aggressive KSM is. Some might want it to be as aggressive as
>> possible to get the maximum de-duplication rate. Others might want a
>> more balanced approach that takes CPU-consumption into consideration.
>> I guess it depends if you are memory-bound, cpu-bound or both.
>
> Agreed, more below.
>
>>
>>> I read above:
>>>> - guarantee responsiveness to changes
>>>> - avoid spending too much CPU
>>>
>>> whereby both things are accountable/measurable to use that as the input for
>>> auto-tuning?
>>>
>> I'm not sure a true auto-tuning can be achieved. I think we need
>> some input from the user
>> - How many resources to consume
>> - How fast memory changes or how stable memory is
>>    (this we might be able to detect)
>
> Setting the pages_to_scan is a bit mystical. Setting upper/lower pages_to_scan
> bounds is similarly mystical, and highly workload dependent.
>
> So I agree that a better abstraction to automatically tune the scanning is
> reasonable. I wonder if we can let the user give better inputs that are less
> workload dependent.
>
> For example, do we need min/max values for pages_to_scan, or can we replace them
> with something better suited to the auto-tuning algorithm?
>
> IMHO "target scan time" goes in the right direction, but it can still be
> fairly workload dependent. Maybe a "max CPU consumption" or sth. like that would
> similarly help to limit CPU waste, and it could be fairly workload independent.

I can look into replacing min/max values for pages_to_scan with min/max
CPU utilization. This might be easier for users to decide on. However, I
still think that we need a target value like scan time to optimize for.



