PSA: New Kernels and intel_idle cpuidle Driver!

Hey guys,

I have a pretty nasty heads-up. If you run a newer Linux kernel on Intel Xeon hardware, you may be experiencing very high CPU wake-up latency. You can check for yourself:

cat /sys/devices/system/cpu/cpuidle/current_driver
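
If you want a bit more detail than the driver name, here is a minimal sketch that also lists the C-states the kernel exposes for one CPU, along with each state's exit latency in microseconds. The function name show_cpuidle and its two arguments are made up for illustration, and it assumes your kernel exposes the usual cpuidle sysfs files (name and latency under each stateN directory).

```shell
#!/bin/sh
# Print the active cpuidle driver and the C-states exposed for one CPU.
# show_cpuidle is a hypothetical helper; the sysfs root is a parameter
# only so the function can be pointed at a different tree.
show_cpuidle() {
    root="${1:-/sys/devices/system/cpu}"
    cpu="${2:-cpu0}"

    if [ -r "$root/cpuidle/current_driver" ]; then
        echo "driver: $(cat "$root/cpuidle/current_driver")"
    else
        echo "driver: unknown"
    fi

    # Each stateN directory describes one idle state for this CPU.
    for state in "$root/$cpu"/cpuidle/state*; do
        [ -r "$state/name" ] && [ -r "$state/latency" ] || continue
        echo "$(basename "$state"): $(cat "$state/name") exit_latency_us=$(cat "$state/latency")"
    done
}

show_cpuidle
```

The deeper the state, the larger the exit latency, which is the cost a sleepy node pays every time it has to wake up.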

If it says intel_idle, the Linux kernel will *aggressively* put your CPU to sleep. We definitely noticed this, and it's pretty darn painful. But it's *more* painful in your asynchronous, standby, or otherwise less busy nodes. Why?

As you can imagine, the secondary nodes don't get much activity, so they spend most of their time sleeping. The more time a CPU spends asleep, the more often it has to pay the wake-up latency when it needs to copy data or process new WAL traffic.

To fix this, you have to either give the driver a hint or disable it outright by picking your own maximum C-state, probably the one you wanted in the BIOS in the first place. We did this by adding the following options to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, but your distro may differ.

intel_idle.max_cstate=0 processor.max_cstate=0 idle=mwait
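
For anyone who would rather script the change than edit the file by hand, here is a rough sketch. The helper name add_idle_opts is hypothetical, and it assumes GRUB_CMDLINE_LINUX_DEFAULT sits on a single double-quoted line, as it does on a stock Debian or Ubuntu install. It keeps a .bak copy of the file and does nothing if the options are already there.

```shell
#!/bin/sh
# Append the idle options to GRUB_CMDLINE_LINUX_DEFAULT in a grub defaults file.
# add_idle_opts is a hypothetical helper. Assumes the variable is on one line
# and double-quoted.
add_idle_opts() {
    file="$1"
    opts="intel_idle.max_cstate=0 processor.max_cstate=0 idle=mwait"

    # Skip the edit if the options were already added.
    if grep -q 'intel_idle.max_cstate=' "$file"; then
        return 0
    fi

    # Keep a backup, then insert the options just before the closing quote.
    cp "$file" "$file.bak"
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $opts\"/" "$file.bak" > "$file"
}

# Usage (as root), then regenerate the GRUB config and reboot:
#   add_idle_opts /etc/default/grub
#   update-grub        # Debian/Ubuntu; other distros typically use grub2-mkconfig
# After the reboot, confirm the options took effect:
#   cat /proc/cmdline
```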

Then reboot. Here are the benefits we got:

* %util difference between backing device and DRBD went down by 30-40% on our replicating nodes.
* TCP RTT is almost 10x lower.

I'm totally not kidding about that last one. Due to the time necessary to wake a CPU to handle the network traffic, latency was massively increased using the intel_idle driver. Our average RTT on a 10G link was 0.375ms before. With the settings above, it's 0.04ms.
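
To show where "almost 10x" comes from, here is the arithmetic on the two averages above as a one-function sketch. The name rtt_ratio is just for illustration.

```shell
#!/bin/sh
# Ratio of the RTT before the change to the RTT after it, to one decimal place.
# rtt_ratio is a hypothetical helper; both arguments are in milliseconds.
rtt_ratio() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

rtt_ratio 0.375 0.04   # prints 9.4
```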

Consider this a PSA. DRBD is unfairly being blamed for bad performance with the intel_idle cpuidle driver in newer kernels! If you have DRBD on a newer Intel system, I highly recommend you make the above changes, especially since it directly affects your replication speed.

It took us days to figure this out, so I figured I'd share.

Thanks, everyone!


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@xxxxxxxxxxxxxxxx


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
