Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Wed, 18 Sep 2019 04:38:59 -0700

On Wed, Sep 18, 2019 at 11:21:04AM +0800, Lin Feng wrote:
> > Adding a new tunable is not the right solution.  The right way is
> > to make Linux auto-tune itself to avoid the problem.  For example,
> > bdi_writeback contains an estimated write bandwidth (calculated by the
> > memory management layer).  Given that, we should be able to make an
> > estimate for how long to wait for the queues to drain.
> > 
> 
> Yes, I have considered that; auto-tuning is definitely the senior AI way.
> But considering all kinds of production environments, hybrid storage solutions
> are also common today: the bdi drivers for a server's dirty pages can span from
> high-end SSDs to low-end SATA disks. So we have to think of a *formula (AI core)*
> using the amount of dirty pages and the bdis' write bandwidth as factors, and this
> AI core will depend on whether the estimated write bandwidth is sane and, moreover,
> whether the dirty pages to be written back are sequential or random if the bdi is
> a rotational disk. It's likely to give a not-sane number and hurt people who don't
> want that, while considering only SSDs is relatively simple.
> 
> So IMHO it's not sane to brute-force a guessing logic into the memory writeback
> code and pray to invent a formula that caters to everyone's needs.
> Adding a sysctl entry may be the right choice: it helps people who need it and
> doesn't hurt people who don't want it.

You're making this sound far harder than it is.  All the writeback code
needs to know is "How long should I sleep for in order for the queues
to drain a substantial amount".  Since you know the bandwidth and how
many pages you've queued up, it's a simple calculation.
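
As a rough illustration of that calculation, here is a userspace sketch,
not kernel code.  The function name, the "drain_percent" target and the
min/max clamps are all hypothetical, and it assumes the bandwidth estimate
is expressed in pages per second:

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate how long to sleep (in milliseconds) so
 * that roughly drain_percent of the queued pages can be written back.
 *
 * Assumptions (none of these come from the kernel source):
 *   - bw_pages_per_sec is an estimated write bandwidth in pages/second
 *   - min_ms and max_ms bound the result so that a bad estimate cannot
 *     produce a zero-length or unbounded sleep
 */
static uint64_t estimate_drain_wait_ms(uint64_t queued_pages,
				       uint64_t bw_pages_per_sec,
				       unsigned int drain_percent,
				       uint64_t min_ms, uint64_t max_ms)
{
	uint64_t pages_to_drain, wait_ms;

	/* No usable bandwidth estimate: fall back to the longest wait. */
	if (bw_pages_per_sec == 0)
		return max_ms;

	/* "A substantial amount" is modelled as a fixed percentage. */
	pages_to_drain = queued_pages * drain_percent / 100;

	/* time = pages / (pages per second), scaled to milliseconds */
	wait_ms = pages_to_drain * 1000 / bw_pages_per_sec;

	if (wait_ms < min_ms)
		wait_ms = min_ms;
	if (wait_ms > max_ms)
		wait_ms = max_ms;
	return wait_ms;
}
```

With 25600 pages queued (100MB of 4kB pages), an estimated 25600 pages/sec
and a 50% drain target, this gives 500ms.  The clamps are one possible way
to limit the damage from an estimate that is not sane, for example on a
rotational disk doing random writes.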