On Sat, Sep 22, 2018 at 04:58:18PM +0200, Christoph Moench-Tegeder wrote:
> ## Doron Behar (doron.behar@xxxxxxxxx):
>
> > My server fails to start PostgreSQL only on boot, if I restart it
> > manually afterwards it doesn't have any problem starting. Here is the
> > log extracted from the journal:
> >
> > ```
> > 2018-09-21 20:46:40.028 CEST [306] LOG: listening on IPv4 address "127.0.0.1", port 5432
> > 2018-09-21 20:46:40.036 CEST [306] LOG: listening on Unix socket "/run/postgresql/.s.PGSQL.5432"
> > 2018-09-21 20:46:40.233 CEST [337] LOG: database system was shut down at 2018-09-21 20:46:21 CEST
> > 2018-09-21 20:48:10.441 CEST [352] WARNING: worker took too long to start; canceled
> > 2018-09-21 20:49:10.469 CEST [352] WARNING: worker took too long to start; canceled
>
> This would indicate that your machine is overloaded during start -
> perhaps there's just too much being started at the same time?
> ObRant: that's what happens if people take "system startup duration"
> as a benchmark and optimize for that - sure, running one clumsy shell
> script after another isn't effective usage of today's systems,
> but starting eight dozen programs all at once may have other
> side effects. Really, with the hardware taking small ages to find
> its own arse before even loading the boot loader, those few seconds
> weren't worth optimizing - and if people reboot their computers so
> often that startup time takes a measurable toll on their productive
> day, perhaps they should rather spend their time thinking about their
> usage pattern than "optimizing" the startup process.
>
> So, now that I've got that off my chest... your machine probably tries to
> do too much at the same time when booting: the worker processes take
> longer than 90 seconds to start. Slow CPU or storage maybe?
>
> > 2018-09-21 20:49:10.478 CEST [306] LOG: database system is ready to accept connections
> > 2018-09-21 20:49:10.486 CEST [306] LOG: received fast shutdown request
>
> And in the meantime, systemd has lost its patience, declares the
> start as failed and terminates the process group. (The default systemd
> timeout is 90 seconds, at least in some releases of systemd, so
> this fits quite nicely).
>
> You could try to work around this by increasing TimeoutStartSec
> in postgresql's systemd unit (or even globally), which perhaps
> only hides the problem until the next service suddenly doesn't
> start anymore.
> You could move postgresql to the end of the boot order by
> adding "After=..." to the Unit section of the systemd service
> file, the value behind "After=" being all the other services in
> the same target, which should reduce parallelism and improve
> PostgreSQL's startup behaviour.
> A more advanced variant of that would be to create a new
> systemd target, make that start "After" multiuser.target
> or even graphical.target (depending on your setup), make sure
> it "Requires" the current default systemd target and make
> postgresql the only additional service in that target.
> (This would be the cleanest solution, but you should get some
> grasp of systemd and how your specific distribution uses it
> before meddling with the default targets; I don't know every
> distribution/version variant of systemd integration, so I
> can't give that specific instructions here).
> Or you figure out what the heck your machine is running
> during startup and why it is that slow, and try to fix that.
>
> Regards,
> Christoph

Thanks for your very detailed answer, it helped me a lot. I've increased
`TimeoutSec=` to infinity in the systemd service, since it was initially
set to 120 seconds, which apparently wasn't enough for my poor VPS with
2 GB of RAM and 1 CPU core.
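
For anyone finding this thread later, here is a minimal sketch of such an override. It assumes the unit is named `postgresql.service` and that the change goes into a drop-in file (for example one created with `systemctl edit postgresql.service`) rather than into the packaged unit file; both the unit name and the drop-in path are assumptions and may differ on your distribution.

```ini
# Hypothetical path: /etc/systemd/system/postgresql.service.d/override.conf
# (the unit name "postgresql.service" is an assumption)

[Service]
# Disable the start/stop timeout so a slow boot does not abort PostgreSQL.
# TimeoutSec= is shorthand that sets both TimeoutStartSec= and TimeoutStopSec=.
TimeoutSec=infinity

# Narrower alternative: only relax the start timeout and leave the stop
# timeout at its previous value.
#TimeoutStartSec=infinity
```

If the file is written by hand instead of through `systemctl edit`, a `systemctl daemon-reload` is needed before the new value is picked up. Note that with `TimeoutSec=infinity` a hung shutdown of the service will no longer be cut off by systemd either.
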
That worked great. I still feel like startups are slow, but at least
PostgreSQL no longer fails to start on boot. I'll try to debug the slow
startups on my own. Thanks again for everything!

Doron.