On 8 January 2018 at 21:39, Willy Tarreau <w@xxxxxx> wrote:
> On Mon, Jan 08, 2018 at 09:26:10PM +0100, Yves-Alexis Perez wrote:
>> On Mon, 2018-01-08 at 19:26 +0100, Willy Tarreau wrote:
>>> You're totally right, I discovered during my later developments that indeed
>>> PCID is not exposed there. So we take the hit of a full TLB flush twice per
>>> syscall.
>>
>> So I really think it might make sense to redo the tests with PCID, because
>> the assumptions you're basing your patch series on might actually not hold.
>
> I'll have to do it on the bare-metal server soon anyway.
>
> Cheers,
> Willy

Today I performed a stress test against various kernels using HAPEE 1.7-r1 (the Enterprise version of HAProxy) on CentOS 7.4, and here is the executive summary:

- The RedHat kernel with the patches increases latency by 275%, drops capacity by 34% and completely saturates CPU resources.
- Kernels 4.9.75 and 4.14.12 do not increase latency and do not drop capacity compared to 4.9.56.
- Kernel 4.14.12 doesn't bring significant improvements over 4.9.75.

Moreover, the kernel packages came with some microcode updates, which were applied to the CPUs during the boot process. I don't have many details about those microcode updates.

Here is some info about the test environment: I used one server as the haproxy node under test and configured a backend pool with a single server, where another haproxy instance acted as the backend server. I used another machine as a generator, sending 80K HTTP requests per second. I didn't have time to set up a distributed stress test, and as a result I could only generate 80K requests/second - a single HAProxy node can serve many more requests than that. The server behind haproxy served all requests from memory and without any external dependencies (disk/network IO). The request and the response fit in one TCP packet. All servers were bare metal, had Intel Xeon E5-2620 v3 CPUs @ 2.40GHz (with PCID support) and were connected to the same switch with 10GbE interfaces.
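Since the thread above hinges on whether the hardware actually exposes PCID, a quick sanity check on each test box is to inspect the CPU flags. A minimal sketch (assuming an x86 Linux box; the pcid/invpcid flag names come from /proc/cpuinfo), which simply prints "absent" when the flags are not advertised:

```shell
# Check whether the first CPU advertises PCID / INVPCID.
# With KPTI, PCID lets the kernel avoid a full TLB flush on each
# user/kernel transition, which is what the quoted thread discusses.
flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null)
for f in pcid invpcid; do
    case " $flags " in
        *" $f "*) echo "$f: present" ;;
        *)        echo "$f: absent"  ;;
    esac
done
```

The surrounding spaces in the pattern make the match word-exact, so e.g. the invpcid_single flag does not count as invpcid.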
Due to network restrictions, I couldn't use the 2nd interface, and as a result the traffic traveled twice over one interface. As the generator tool I used wrk2, with 80K requests/second set as the maximum request rate.

I have also compared two production haproxy servers running RedHat kernels with and without the patches, and the server with the patches spent 50% more CPU time at the system and softirq levels. Those production servers have been handling customer traffic since Friday, so I have 100% confidence in the CPU increase. I will do the same for two production servers with the 4.9.56 and 4.9.75 kernels.

Moreover, I have observed a similar performance drop with the RedHat kernel on different workloads, such as MySQL, Cassandra and Graphite. On the latter, I observed a big performance win with 4.14.12.

Several people mentioned to me that kernel 4.14 should provide much better performance due to the longer-lived TLB entries enabled by PCID. So far, I haven't seen that in my haproxy test. This does not mean that it doesn't have better performance; absence of evidence is not evidence of absence :-)

All of the above is specific to my workload and setup; other workloads may see more or less performance impact.

Last but not least, I would like to take this opportunity to say a huge THANK YOU to all the developers/engineers involved in making Linux more secure and safe to use.

Cheers,
Pavlos