Re: does wal archiving block the current client connection?

Simon Riggs <simon@xxxxxxxxxxxxxxx> · Mon, 15 May 2006 22:23:00 +0100

On Mon, 2006-05-15 at 09:28 -0700, Jeff Frost wrote:
> I've run into a problem with a PITR setup at a client.  The problem is that 
> whenever the CIFS NAS device that we're mounting at /mnt/pgbackup has 
> problems

What kind of problems?

> , it seems that the current client connection gets blocked and this 
> eventually builds up to a "sorry, too many clients already" error.  

This sounds like the archiver keeps waking up and trying the command,
which fails, yet each attempt causes a resource leak on the NAS.
Eventually, the archiver's retrying of the command fails. Or am I
misunderstanding your issue?

> I'm 
> wondering if this is expected behavior with the archive command and if I 
> should build in some more smarts to my archive script.  Maybe I should fork 
> and waitpid such that I can use a manual timeout shorter than whatever the 
> CIFS timeout is so that I can return an error in a reasonable amount of time?

The archiver is designed around the idea that *attempting* to archive
is a task that it can do indefinitely without a problem; it's up to you
to spot that the link is down.
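
If you do want the script to give up sooner than the CIFS timeout, the
fork-and-wait idea could look roughly like the sketch below. This is an
illustration only, not something shipped with PostgreSQL: the script, the
function names and the 30 second limit are all assumptions, and
/mnt/pgbackup is simply the mount point from your description. It runs the
copy in a child process, waits a bounded time, and returns non-zero on
failure or timeout so that the archiver treats the segment as not yet
archived and retries it later.

```python
#!/usr/bin/env python3
"""Hypothetical archive_command wrapper: copy one WAL segment with a timeout."""
import os
import subprocess
import sys

ARCHIVE_DIR = "/mnt/pgbackup"  # mount point mentioned in the original post
TIMEOUT_SECONDS = 30           # assumed value; choose one below the CIFS timeout


def run_with_timeout(argv, timeout):
    """Run argv in a child process; return 0 on success, 1 on failure or timeout."""
    proc = subprocess.Popen(argv)
    try:
        rc = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Ask the kernel to kill the child but do not wait for it: a process
        # stuck in uninterruptible I/O on a dead mount may not exit right away.
        proc.kill()
        return 1
    return 0 if rc == 0 else 1


def archive_segment(src_path, file_name,
                    archive_dir=ARCHIVE_DIR, timeout=TIMEOUT_SECONDS):
    """Copy one WAL segment into archive_dir, giving up after `timeout` seconds."""
    # Only the child touches the archive mount, so the wrapper itself
    # cannot hang on a stalled filesystem call.
    dest = os.path.join(archive_dir, file_name)
    return run_with_timeout(["cp", src_path, dest], timeout)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Expected invocation: <script> %p %f
    sys.exit(archive_segment(sys.argv[1], sys.argv[2]))
```

Assuming the script were installed at some path of your choosing, it would
be wired in with something like archive_command = '/path/to/script "%p" "%f"'.
Note that this sketch does not refuse to overwrite an existing archive file,
which a production script should do.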

We can put something in to lengthen the retry period, but you'd
need to make a reasonable case for how that would increase robustness. 

-- 
  Simon Riggs
  EnterpriseDB          http://www.enterprisedb.com