Martijn van Oosterhout wrote:
On Mon, May 22, 2006 at 12:52:33PM -0400, Greg Stark wrote:
"Rafael Martinez, Guerrero" <r.m.guerrero@xxxxxxxxxxx> writes:
Why do you think 'intr' is a bad thing, from man pages:
" ........ If an NFS file operation has a major timeout and it is
hard mounted, then allow signals to interrupt the file operation and
cause it to return EINTR to the calling program. The default is to not
allow file operations to be interrupted ....."
Traditional file systems guaranteed it never happened, so older applications
do not expect to have filesystem operations interrupted. Many do not check for
it or do not handle it properly. I recall a conversation a while back about
Postgres in particular not checking for it.
I've occasionally wondered if this is a SysV vs BSD thing. Under SysV
signal semantics, any signal would cause the current system call to
return EINTR. The list of system calls that could be interrupted is
long, and includes just about anything filesystem related. So programs
with any kind of signal handling would handle the broken-NFS case
automatically.
BSD signal semantics (what postgres uses) make all system calls
restart across signals. Thus, a system call can never return EINTR
unless you have non-blocking I/O enabled. These programs would be
confused by unexpected EINTRs.
AFAIK, Linux actually aborts syscalls when a signal arrives, and it's
just the libc that restarts them automatically. So, actually, doing
    do {
        ret = syscall(args);
    } while (ret == -1 && errno == EINTR);
in your code should be equivalent to telling the libc to provide BSD semantics, and
just do
ret = syscall(args) ;
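Spelled out in C for one concrete call, the retry loop might look like the sketch below. read_retry is a made-up name used only for illustration. The point to note is that POSIX calls do not return EINTR directly: they return -1 and set errno to EINTR, so that is what the loop has to test.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: behaves like read(), but retries the call
 * whenever it was interrupted by a signal before transferring data.
 * Any other error is returned to the caller unchanged (-1, errno set).
 */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

A caller would use read_retry(fd, buf, sizeof buf) wherever it would otherwise call read(). The loop only hides the interruption; it does not bound how long the call may block.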
Postgres doesn't check for EINTR on all filesystem system calls and thus
would be susceptible to the above problem.
Even if postgres checked for EINTR, what could it possibly do in that case?
Just retrying won't have any advantage over simply mounting with "nointr" -
it would still just hang when the NFS server dies.
greetings, Florian Pflug