Hi,

I have a PostgreSQL 11 server running on Amazon Linux on a t2.medium EC2 instance. It serves 24 worker processes that read jobs from a queue (thanks to SELECT ... FOR UPDATE SKIP LOCKED!) and execute them. Some of the jobs read and write business data in the database, others only read, and some don't touch the business data at all, only the queue.
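To give an idea of the shape of the dequeue query: it is essentially the standard SKIP LOCKED pattern, something along these lines (a simplified sketch; job_queue, job_id, payload and status are made-up names, not my actual schema):

BEGIN;

-- claim one pending job, skipping rows other workers already hold locked
WITH next_job AS (
    SELECT job_id
      FROM job_queue
     WHERE status = 'pending'
     ORDER BY job_id
     LIMIT 1
       FOR UPDATE SKIP LOCKED
)
UPDATE job_queue q
   SET status = 'running'
  FROM next_job
 WHERE q.job_id = next_job.job_id
RETURNING q.job_id, q.payload;

-- ... the worker then does the job, possibly reading/writing business data ...

COMMIT;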
Everything flows quite nicely, except that I don't understand why I can't max out the CPU or the I/O: I/O is almost negligible, yet the CPU sits around 30% and hardly ever hits 50%. Here is a view of top:

top - 23:17:09 up 45 days,  2:07,  4 users,  load average: 20.32, 18.92, 13.80
Tasks: 338 total,  24 running, 111 sleeping,   0 stopped,   0 zombie
%Cpu(s): 28.7 us,  2.5 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si, 68.7 st
KiB Mem :  4040028 total,  1070368 free,   324460 used,  2645200 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  2223720 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM    TIME+ COMMAND
 7678 postgres  20   0 1235072 509744 506356 R  8.7 12.6  1:14.82 postgres: auser integrator 172.31.61.242(25783) BIND
 7998 postgres  20   0 1235108 516480 512772 R  8.7 12.8  1:16.20 postgres: auser integrator 172.31.49.159(51708) SELECT
 2183 postgres  20   0 1261436 985.8m 982544 R  8.5 25.0  0:44.04 postgres: auser integrator [local] SELECT
 7653 postgres  20   0 1235180 501760 497984 R  8.2 12.4  1:13.66 postgres: auser integrator 172.31.54.158(47959) SELECT
 7677 postgres  20   0 1235144 506740 502980 S  8.2 12.5  1:13.54 postgres: auser integrator 172.31.61.242(56510) idle in t+
 7680 postgres  20   0 1234684 484356 481100 R  8.2 12.0  1:13.86 postgres: auser integrator 172.31.61.242(49966) SELECT
 2631 postgres  20   0 1235120 937964 934528 R  7.9 23.2 10:48.39 postgres: auser integrator 172.31.49.159(33522) idle in t+
 7664 postgres  20   0 1235104 524664 520976 R  7.9 13.0  1:13.95 postgres: auser integrator 172.31.57.147(30036) BIND
 7682 postgres  20   0 1234660 496188 492956 R  7.9 12.3  1:15.50 postgres: auser integrator 172.31.61.242(26330) COMMIT
 7687 postgres  20   0 1234876 490104 486656 R  7.9 12.1  1:16.77 postgres: auser integrator 172.31.63.71(25285) BIND
 7660 postgres  20   0 1235100 502004 498596 R  7.6 12.4  1:18.00 postgres: auser integrator 172.31.57.147(46051) PARSE
 7662 postgres  20   0 1235148 503532 500280 R  7.6 12.5  1:14.03 postgres: auser integrator 172.31.57.147(48852) UPDATE
 7681 postgres  20   0 1234688 516864 513596 R  7.6 12.8  1:17.77 postgres: auser integrator 172.31.61.242(48192) SELECT
 7685 postgres  20   0 1235096 515352 511968 R  7.6 12.8  1:16.17 postgres: auser integrator 172.31.63.71(62540) BIND
 7689 postgres  20   0 1235100 509504 505836 S  7.6 12.6  1:14.78 postgres: auser integrator 172.31.63.71(12287) idle in tr+
 7684 postgres  20   0 1235052 500336 496916 R  7.3 12.4  1:14.83 postgres: auser integrator 172.31.63.71(19633) BIND
 7654 postgres  20   0 1235224 514512 511040 S  7.0 12.7  1:18.89 postgres: auser integrator 172.31.57.147(43437) idle in t+
 7656 postgres  20   0 1234684 510900 507636 R  7.0 12.6  1:16.19 postgres: auser integrator 172.31.54.158(30397) idle in t+
 7661 postgres  20   0 1234684 514920 511648 S  7.0 12.7  1:16.27 postgres: auser integrator 172.31.57.147(38243) SELECT
 7679 postgres  20   0 1235112 512228 508544 R  7.0 12.7  1:14.60 postgres: auser integrator 172.31.61.242(34261) PARSE
 7663 postgres  20   0 1234684 517068 513812 R  6.8 12.8  1:17.42 postgres: auser integrator 172.31.57.147(19711) SELECT
 7655 postgres  20   0 1235036 505584 502208 R  6.5 12.5  1:16.17 postgres: auser integrator 172.31.54.158(24401) BIND
 7658 postgres  20   0 1235072 509432 506108 R  6.5 12.6  1:15.90 postgres: auser integrator 172.31.54.158(48566) idle in t+
 7659 postgres  20   0 1234688 497488 494232 S  6.2 12.3  1:15.70 postgres: auser integrator 172.31.54.158(49841) idle in t+
 7686 postgres  20   0 1234748 508412 505084 R  5.9 12.6  1:14.70 postgres: auser integrator 172.31.63.71(54938) BIND
 7688 postgres  20   0 1234708 502416 499204 S  5.9 12.4  1:14.14 postgres: auser integrator 172.31.63.71(49857) SELECT
 7657 postgres  20   0 1234684 513740 510476 R  5.6 12.7  1:15.96 postgres: auser integrator 172.31.54.158(47300) SELECT
 1304 ec2-user  20   0  171400   4648   3840 R  1.7  0.1  1:25.66 top -c
 8492 root      20   0       0      0      0 R  1.7  0.0  0:00.41 [kworker/1:2]

and of iostat:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          36.41    0.00    3.80    0.00   59.78    0.00

Device:  rrqm/s  wrqm/s    r/s    w/s   rkB/s   wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
xvda       0.00    0.00   0.00   7.69    0.00   92.31    24.00     0.02   3.00    0.00    3.00   3.00   2.31
xvdf       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdg       0.00    0.00   0.00  11.54    0.00  130.77    22.67     0.05   4.00    0.00    4.00   0.33   0.38
xvdh       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdi       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdj       0.00    0.00   5.77   0.96   46.15    7.69    16.00     0.01   1.14    0.67    4.00   1.14   0.77
xvdk       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdl       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdm       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdn       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdo       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdp       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdq       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdr       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvds       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdt       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdu       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdv       0.00    0.00   0.00   8.65    0.00   76.92    17.78     0.03   4.00    0.00    4.00   0.44   0.38
xvdw       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdx       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdy       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdz       0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdaa      0.00    0.00   0.00   0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdab      0.00    0.00   0.00   2.88    0.00   30.77    21.33     0.01   4.00    0.00    4.00   1.33   0.38

Previously I had hit 100 %util here; that was when the tables were not spread out over so many tablespaces. Now they are spread out the way you would have done it in the olden days, over many "spindles", and I did that here so I could see which tables or indexes would become bottlenecks.
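For reference, the layout is simply one tablespace per EBS volume, with tables and indexes assigned to them roughly like this (the names and paths below are only illustrative, not the real ones):

-- one tablespace per mounted volume (illustrative names and paths)
CREATE TABLESPACE ts_xvdg LOCATION '/data/xvdg/pgdata';
CREATE TABLESPACE ts_xvdj LOCATION '/data/xvdj/pgdata';

-- put a hot table and one of its indexes on their own volumes
ALTER TABLE some_business_table SET TABLESPACE ts_xvdg;
ALTER INDEX some_business_table_pkey SET TABLESPACE ts_xvdj;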
So how can it be that queries take quite long without the processes running at a higher CPU%? Or is there something wrong with the total CPU% reported by both top and iostat? From the top output it looks like my 24 worker processes use about 8% each, most of them in the R (running) state, so that would be roughly 192%, which divided over the 2 CPUs of the t2.medium instance is really 96% per CPU. So am I CPU bound after all?

regards,