Re: resources prioritization/scheduler (app vs assets)

Raphaël <raphael.droz@xxxxxxxxx> · Mon, 23 Jan 2017 00:01:35 -0300

Hi again,

First, thank you very much for your answer, and my apologies for
replying after more than a month.

On Sat, Dec 10, 2016 at 02:49:16PM +0000, Dr James Smith wrote:
> Your "model" of traffic is probably wrong...
> 
>  * Have you seen this traffic shape.. if a user requests a page - it
>    will probably be a few milliseconds before the browser requests the
>    first static file, they will usually limit themselves to something
>    like 4, 6 or 8 parallel requests (pipelining) to minimize the
>    DNS/connect/handshake/disconnect phases;

I _believe_ I've encountered this traffic shape.
Indeed, individual users run concurrent requests and benefit from
TCP reuse/keep-alive, but:
- they will still end up as 40 requests w.r.t. MaxRequestWorkers
- if there's a traffic burst (the scenario), other clients do hit
  the main script before the first client even started requesting static
  files (especially if page-load takes dozens or even hundreds of milliseconds)

>  * Even if you have large numbers of simultaneous users the amount of
>    traffic won't be as bursty as you say - as they wont' all hit "go"
>    at the same time;

I took the hypothesis of 200 concurrent clients; it's a challenging
figure, and probably slightly over-estimated on my side.

But reaching 100k people in a 5-minute period is not far from having
~200/sec. In such conditions, having ~100 concurrent requests on the
main script is not that improbable (esp. if page loads range from 100ms
to 500ms).
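
As a rough sanity check, Little's law (average concurrency = arrival
rate x average time in the system) gives the order of magnitude. The
sketch below is only an illustration: the function names are made up,
and the inputs are simply the figures quoted above.

```python
def average_rate(visitors: int, period_s: float) -> float:
    """Average arrival rate (requests/sec) if each visitor loads one page."""
    return visitors / period_s


def concurrent_requests(arrival_rate_per_s: float, service_time_s: float) -> float:
    """Little's law: average number of requests in flight."""
    return arrival_rate_per_s * service_time_s


# 100k visitors spread evenly over 5 minutes (assumption: one page load each)
print(f"average rate: {average_rate(100_000, 5 * 60):.0f} req/s")

# ~200 req/s with page loads between 100 ms and 500 ms
for service_time_s in (0.1, 0.5):
    in_flight = concurrent_requests(200, service_time_s)
    print(f"{service_time_s * 1000:.0f} ms/page -> ~{in_flight:.0f} requests in flight")
```

With these inputs, ~200 req/s and 100-500ms page loads correspond to
roughly 20 to 100 main-script requests in flight, and an even spread of
100k visitors over 5 minutes averages about 333 req/s.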

>  * The larger overheads are more likely to be up stream in network etc.
Thanks for the hint (I haven't yet done the calculation about this).

> Now to reduce load ...
>  * Look at a dedicated caching layer in front of apache. e.g. varnish
>    which can cache the static content; get your headers right so that
>    browsers + upstream caches cache your content;
>  * Look at the apache event mpm - which is much lighter than the other
>    mpms (prefork/threaded)
>  * Do you need 40 assets or can you do optimization on these (e.g.
>    merging css/js files) reducing images, icon fonts, css, spriting
>    etc; I have taken a site requiring 100s of assets and gained by
>    reducing these to 10-15...
>  * If you are worried about performance on such a small box then
>    redesign so that the site isn't heavy!
>  * Look at offloading some resources to 3rd party CDNs (e.g. fonts core
>    js-libraries etc)

Good advice.
Most of it is already in place: caching (within the same Apache
instance), headers, assets aggregation.
40 static files is the upper bound, in order to be conservative (most
of them are images).
And mpm_worker is used (I remember having read somewhere that it's
better than mpm_event when SSL/mod_ssl is used).

The last point is very interesting:
I don't want to offload to a 3rd party until I'm technically forced to do so.
I want to do that offloading *within* Apache :)

More exactly, I'd like to do the traffic prioritization rather than
sweeping the issue under the 3rd-party carpet.

> Look at your hardware - if you are this worried - 1G is a very small
> box -

Currently the virtual machine scales from 2 GB to 8+ GB according to the
traffic expectations.

Whatever the values of:
* the static files/main-script ratio
* the time for a page load
* the quantity of memory available
... the bottleneck initially described is purely a matter of the amount
of concurrency.

It is going to be reached sooner or later, and I'd like to take care of it.
(Adjusting the quantity of memory is just a quick and easy way to do
this, but neither the best nor the most interesting one.)

Plus, I don't have automatic memory allocation, but I could do a "good
enough" prioritization + a "realistic" memory allocation. That would
scale better than using the traffic burst crystal-ball.
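
To make the concurrency bottleneck concrete, the worker arithmetic from
my original message (350 Apache workers, a 200-client burst, 7 PHP-FPM
children, 40 static files per page) can be reproduced with a short
sketch. The function and key names are hypothetical; this is plain
arithmetic, not a model of Apache internals.

```python
def burst_snapshot(max_workers=350, burst=200, fpm_children=7, assets_per_page=40):
    """Worker occupancy right after the first batch of pages has been served."""
    served = fpm_children                      # first batch: pages already delivered
    in_progress = fpm_children                 # second batch: inside PHP-FPM right now
    queued = burst - served - in_progress      # workers held while waiting for PHP-FPM
    asset_requests = served * assets_per_page  # static files requested by the first batch
    free_workers = max_workers - in_progress - queued
    assets_served_now = min(asset_requests, free_workers)
    return {
        "queued_for_php": queued,
        "asset_requests": asset_requests,
        "assets_served_now": assets_served_now,
        "assets_without_worker": asset_requests - assets_served_now,
    }


for name, value in burst_snapshot().items():
    print(f"{name}: {value}")
```

Changing `burst` shows how quickly the static requests get starved: the
number of workers left for them is simply MaxRequestWorkers minus the
main-script requests still holding a worker.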

> If you want this level of resiliance you probably need to look at load
> balancing over multiple serves - then you can dedicate some to static
> servers and some dynamic servers...

Yes, this is exactly what I want (but within an Apache server, and even
better, within one Apache virtual host)

for reference, I wrote:
> >How to make Apache only accept 1/40 of the traffic to the fcgi php-fpm proxy.
> >Sample heuristic:
> >>If all worker are used (350/350), we "compute" which proportion is
> >>dedicated to index.php. If it's superior to a given configurable
> >>threshold, then free some of the workers dedicated to this resources
> >>in order to accept assets-directed resources.

[Substitute "assets" with "static" or "non-fcgi-proxy". The initial
Subject line was misleading about "assets"; I should have written "static".]

Reformulation:
Assuming that during a traffic burst, sooner or later, the main script
(php-fcgi) will be the bottleneck because it's slow, how can I prevent
the accumulation of main-script requests from blocking requests directed
to static files?

It could be seen as the opposite of the "5% of blocks reserved for root"
on ext4: a percentage of ThreadsPerChild which, under high load, is
*not* allocated to the main script?
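
The reserve idea can be modelled as a small admission rule. This is a
standalone sketch, not an Apache feature or configuration: the class,
its fields and the "dynamic"/"static" labels are all hypothetical, and
the numbers are the ones from my example.

```python
from dataclasses import dataclass


@dataclass
class WorkerPool:
    total: int            # e.g. MaxRequestWorkers
    static_reserve: int   # workers that are never handed to the main script
    dynamic_busy: int = 0
    static_busy: int = 0

    def try_acquire(self, kind: str) -> bool:
        """Grant a worker to a 'dynamic' or 'static' request if the rule allows it."""
        if self.dynamic_busy + self.static_busy >= self.total:
            return False  # pool exhausted
        if kind == "dynamic":
            # Main-script requests may only use the non-reserved share.
            if self.dynamic_busy >= self.total - self.static_reserve:
                return False
            self.dynamic_busy += 1
        else:
            self.static_busy += 1
        return True

    def release(self, kind: str) -> None:
        """Give a worker back once the request is finished."""
        if kind == "dynamic":
            self.dynamic_busy = max(0, self.dynamic_busy - 1)
        else:
            self.static_busy = max(0, self.static_busy - 1)


pool = WorkerPool(total=350, static_reserve=280)
admitted_dynamic = sum(pool.try_acquire("dynamic") for _ in range(200))
admitted_static = sum(pool.try_acquire("static") for _ in range(280))
print(admitted_dynamic, admitted_static)
```

In this model the burst of 200 main-script requests can hold at most
350 - 280 = 70 workers, so the 280 static requests of the first 7 pages
all find a worker; the refused main-script requests would have to queue
or be rejected, which is the trade-off I'm asking about.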

Thanks!

> 
> On 10/12/2016 14:22, Raphaël wrote:
> >Hi,
> >
> >I've a question on how to prioritize traffic in order to optimize
> >the service in the case of traffic bursts:
> >
> >
> >Context:
> >* a server with finite resources (let's say 1 GB mem)
> >* a PHP application: initial page load needs 100 MB (index.php)
> >* for each page load (index.php) approx:
> >   * ~ 40 subsequent assets (static files) are needed
> >   * serving assets is, obviously, quicker than serving index.php
> >* I assume, and decide, that PHP-FPM must not use more than 700MB
> >* I want to avoid "broken" pages (missing assets/images/...) as much as possible
> >
> >
> >Thus PHP-FPM is configured to not allow no more than 7 children.
> >The Apache MaxRequestWorkers (worker MPM) is set to be strictly superior than
> >7*40 (lets say 350)
> >
> >
> >Now imagine a traffic burst with 200 distinct clients simultaneously
> >hitting the main page (wow!)
> >They now occupy 57% of the Apache workers, 193 of them waiting for a
> >PHP-FPM child. (<Proxy> "max" default value being ThreadsPerChild)
> >
> >... some hundreds milliseconds later...
> >
> >The 7 first clients having been served, each one now requests 40 more assets.
> >And the situation is then as follows:
> >
> >* 7 hits on index.php were already processed successfully
> >* 7 currently being processed by PHP-FPM (still occupying Apache workers)
> >* 186 queued Apache workers hits /index.php, waiting for PHP-FPM/proxy-fcgi
> >* 7*40 = 280 new hits for assets (subsequent resources needed by the 7 first clients)
> >    * 157 of them immediately get an available Apache worker and can be
> >      served (157+186+7 == 350)
> >    * >>>>>>>  123 assets will NOT get an available worker  <<<<<<< PROBLEM HERE
> >
> >
> >In the "best" case these 123 requests, which should have been served
> >*now*, will end up in the ListenBackLog and wait the 157 first assets to
> >be served first and liberate their workers.
> >
> >The server works virtually *as* if only 350-200 = 150 workers were
> >available (150 being < 280, which is the typical workers implication
> >for 7 pages-load)
> >
> >200 being the (unpredictable/variable) "intensity" of the burst, I would
> >like to know of a better way to handle such a situation.
> >
> >
> >The first ideas that come to mind is service shaping (prioritization/quotas):
> >How to make Apache only accept 1/40 of the traffic to the fcgi php-fpm proxy.
> >Sample heuristic:
> >>If all worker are used (350/350), we "compute" which proportion is
> >>dedicated to index.php. If it's superior to a given configurable
> >>threshold, then free some of the workers dedicated to this resources
> >>in order to accept assets-directed resources.
> >
> >I'm curious about possible solutions.
> >Thank you for reading.
