On Wed, Dec 18, 2019 at 3:53 AM James(王旭) <wangxu@xxxxxxxxx> wrote:
> Hello,
>
> I ran into this kernel message, and I can no longer log in to the Linux system:
>
>> Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
>> Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
>> Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Dec 17 23:02:30 hq-pg kernel: sync D ffff965ebabd1040 0 6573 6572 0x00000080
>> Dec 17 23:02:30 hq-pg kernel: Call Trace:
>> Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70
>
> After some googling, I guess the problem is that I/O is slow while insert requests are arriving too quickly, so PG puts them into the cache first and then the kernel calls sync.
>
> I know I can queue the requests, so that POSTGRES will not accept requests that would increase the system cache.
>
> But is there any way to tell POSTGRES that it can only handle 20000 records per second, or 4M per second, and should not accept inserts faster than that?
>
> For me, having POSTGRES simply wait is much better than the current behavior.
>
> Any help will be much appreciated.

This is more a problem with the OS than with Postgres itself.

synchronous_commit is one influential parameter that may help mitigate the issue, with some safety tradeoffs (read the docs).

On Linux, one place to look is tuning dirty_background_ratio and related parameters. The idea is to make the OS more aggressive about syncing, to reduce the impact of I/O storms; basically you are trading some burst performance for consistency of performance.

Another place to look is checkpoint behavior. Do some searching; there is a ton of information about this on the net.

merlin
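
P.S. Before changing any of the kernel writeback settings mentioned above, it helps to see what they are currently set to. The following is only a rough sketch, not something from the original exchange: the function name read_writeback_settings and the particular list of knobs are illustrative choices, and it assumes a Linux system that exposes these files under /proc/sys/vm. It only reads values and changes nothing.

```python
from pathlib import Path

# Writeback-related knobs the Linux kernel exposes under /proc/sys/vm.
# The selection here is illustrative, not exhaustive.
WRITEBACK_KNOBS = (
    "dirty_background_ratio",
    "dirty_background_bytes",
    "dirty_ratio",
    "dirty_bytes",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_writeback_settings(base="/proc/sys/vm"):
    """Return {knob: integer value} for every knob readable under base."""
    settings = {}
    for name in WRITEBACK_KNOBS:
        path = Path(base) / name
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing or unreadable on this system: skip it rather than guess.
            continue
    return settings


if __name__ == "__main__":
    for name, value in read_writeback_settings().items():
        print(f"vm.{name} = {value}")
```

Note that each *_ratio knob has a *_bytes counterpart and only one of the pair is in effect at a time; the inactive one reads as 0. Changing a value is done as root with sysctl (for example sysctl -w vm.dirty_background_ratio=<value>). Suitable values depend on RAM size and storage speed, so none are suggested here.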