On Jan 25, 2010, at 2:38 PM, Whoop Whouzer wrote:
Running "strace nautilus" gives me a lot of output. When I run it
while the server is down, it completes the trace without a hiccup and
returns; then nautilus is launched and hangs.
There are differences between the traces (with server up and server
down). I can't really see where the problem lies in there.
I would expect that the command-line nautilus forks when it starts
up. If it has some option you can specify to prevent that, it might
allow a deeper look. You would need to tell strace to look at the
children, too.
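A minimal sketch of the "look at the children, too" suggestion, not part of the original exchange. It assumes strace is installed on the client; the helper names (build_strace_cmd, slow_calls) and the output file name nautilus.trace are hypothetical. The flags are standard strace options: -f follows forked children, -tt adds timestamps, -T appends the time spent in each call, and -o writes the trace to a file.

```python
import re
import shlex

# `strace -T` appends "<seconds>" to every completed system call.
DURATION = re.compile(r"<(\d+\.\d+)>\s*$")

def build_strace_cmd(program, out_path):
    """Return an strace command line that also traces child processes."""
    return ["strace", "-f", "-tt", "-T", "-o", out_path, program]

def slow_calls(lines, threshold=1.0):
    """Yield (seconds, line) for calls that took at least `threshold` seconds."""
    for line in lines:
        match = DURATION.search(line)
        if match and float(match.group(1)) >= threshold:
            yield float(match.group(1)), line.rstrip()

# Print the command to run by hand; "nautilus.trace" is a hypothetical name.
print(shlex.join(build_strace_cmd("nautilus", "nautilus.trace")))
# strace -f -tt -T -o nautilus.trace nautilus
```

Once the trace exists, something like `for secs, line in slow_calls(open("nautilus.trace")): print(secs, line)` would list the calls that blocked for a second or more. Calls that never returned carry no duration, so the last lines of the file are worth reading by eye as well.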
On Mon, Jan 25, 2010 at 8:08 PM, Chuck Lever
<chuck.lever@xxxxxxxxxx> wrote:
On Jan 25, 2010, at 2:02 PM, Whoop Whouzer wrote:
OK, I did that. After shutting down the server and enabling debug
tracing, I tried to open the home folder of the current account
(totally unrelated to the NFS share). It wouldn't open at all; no
nautilus window appeared. While my cursor was in busy mode I got the
following messages in kern.log (Ubuntu 10.04 client):
Jan 25 19:30:13 whoop-desktop kernel: [ 160.719262] NFS call fsstat
Jan 25 19:30:37 whoop-desktop kernel: [ 184.458611] NFS: permission(0:16/74386), mask=0x10, res=0
Jan 25 19:30:37 whoop-desktop kernel: [ 184.458647] NFS call access
Jan 25 19:30:43 whoop-desktop kernel: [ 190.721086] nfs: server 192.168.1.130 not responding, timed out
Jan 25 19:30:43 whoop-desktop kernel: [ 190.721113] NFS reply statfs: -5
Jan 25 19:30:43 whoop-desktop kernel: [ 190.721116] nfs_statfs: statfs error = 5
This series of traces repeats over and over at a set interval (there
is no flooding of the logs), even if I do nothing.
It's even worse than I thought: when I tried to shut down, the
machine wouldn't, because it claimed the "File manager" was still
running (although it was not visible on screen), so I had to kill it
before I could shut down properly.
In Fedora 12 I had a similar user experience (nautilus did show up
without showing any contents and it was hanging). I had enabled
tracing and it seems to be logged to /var/log/messages. I got this
output in fedora:
Jan 25 20:48:38 localhost kernel: NFS reply statfs: -5
Jan 25 20:48:38 localhost kernel: nfs_statfs: statfs error = 5
Jan 25 20:48:38 localhost kernel: NFS call fsstat
Jan 25 20:49:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out
Jan 25 20:49:14 localhost kernel: NFS reply getattr: -5
Jan 25 20:49:14 localhost kernel: nfs_revalidate_inode: (0:14/74386) getattr failed, error=-5
Jan 25 20:49:25 localhost kernel: NFS: revalidating (0:14/74386)
Jan 25 20:49:25 localhost kernel: NFS call getattr
Jan 25 20:50:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out
Jan 25 20:50:14 localhost kernel: NFS reply access: -5
Jan 25 20:50:14 localhost kernel: NFS: permission(0:14/74386), mask=0x1, res=-5
Jan 25 20:50:14 localhost kernel: NFS call access
Jan 25 20:51:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out
Jan 25 20:51:14 localhost kernel: NFS reply statfs: -5
Jan 25 20:51:14 localhost kernel: nfs_statfs: statfs error = 5
Jan 25 20:51:14 localhost kernel: NFS call fsstat
Most of this trace repeats at set intervals as well; there is no
flooding of the logs.
Fedora would not shut down normally either.
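The "set interval" can be read off the timestamps. A minimal sketch, not part of the original message, that measures the gap between consecutive "not responding, timed out" messages. The helper name timeout_intervals is hypothetical; the three input lines are the ones from the Fedora excerpt above with their wrapped halves rejoined. Syslog timestamps omit the year, so 2010 is assumed, and the month abbreviation is parsed under an English locale.

```python
from datetime import datetime

def timeout_intervals(log_lines, year=2010):
    """Return seconds between consecutive 'not responding' kernel messages."""
    stamps = []
    for line in log_lines:
        if "not responding" in line:
            # The first 15 characters hold the timestamp, e.g. "Jan 25 20:49:14".
            stamps.append(
                datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
            )
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

fedora_log = [
    "Jan 25 20:49:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out",
    "Jan 25 20:50:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out",
    "Jan 25 20:51:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out",
]
print(timeout_intervals(fedora_log))  # [60.0, 60.0]
```

In this excerpt the timeouts are 60 seconds apart. Whether that spacing comes from the mount's timeout settings or from how often something polls the share is not something these few lines can establish.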
This verifies that your client is attempting to access the NFS server,
but doesn't tell us which file it's attempting to access. Essentially
the EIO means "failed to connect".
Maybe try an strace of the nautilus process next?
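The 5 and -5 in the log lines are errno values, which is where the EIO above comes from. A quick way to confirm the symbolic name, sketched here under the assumption of a Linux client with Python available (this snippet is not from the original thread):

```python
import errno
import os

code = 5  # printed as "statfs error = 5" and "-5" in the kernel log
# errno.errorcode maps the number to its symbolic name; os.strerror gives
# the C library's description of the same error.
print(errno.errorcode[code], "-", os.strerror(code))
```

The kernel reports the error as a negative return value (-5) in some lines and as a positive number (5) in others; both refer to the same errno.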
On Mon, Jan 25, 2010 at 5:48 PM, Chuck Lever
<chuck.lever@xxxxxxxxxx> wrote:
On Jan 24, 2010, at 7:09 PM, Whoop Whouzer wrote:
I did some network traces and there is nothing strange happening as
far as I can tell. I shut down the server (some network traffic
occurred as is to be expected). It got quiet again, I launched
nautilus, it got stuck without displaying anything and there was no
real network activity except 3 broadcasts using the ARP protocol
asking where the server was (could be just coincidence).
That sounds like the client does want to reconnect with the server.
You could try enabling debug tracing on your client
(sudo rpcdebug -m nfs -s all) after shutting down your server, then
try to start nautilus. The kernel log would then contain NFS-related
messages that might indicate where to look next.
Closing nautilus and launching it again makes it hang again, but I see
no additional network traffic. After a while nautilus will display the
contents of the folder without any network traffic.
On Sun, Jan 24, 2010 at 10:34 PM, Muntz, Daniel
<Dan.Muntz@xxxxxxxxxx> wrote:
Perhaps something in your $PATH is in the NFS mount? Do a network
trace and maybe you can see if, in fact, there are actually NFS
operations being attempted that you weren't expecting. Then try to
figure out why.
-Dan
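The $PATH question above can be checked without a network trace. A minimal sketch, not part of the original message, assuming a Linux client where /proc/mounts lists active mounts. The function names are hypothetical. Paths are compared as strings only, so the check never touches the (possibly hung) NFS mount, at the cost of missing $PATH entries that reach the share through a symlink.

```python
import os

def nfs_mount_points(mounts_text):
    """Return mount points whose filesystem type is nfs or nfs4."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points

def path_entries_on_nfs(path_value, nfs_points):
    """Return the $PATH entries located at or below an NFS mount point."""
    hits = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue
        norm = os.path.normpath(entry)  # string cleanup only, no disk access
        for point in nfs_points:
            if norm == point or norm.startswith(point.rstrip("/") + "/"):
                hits.append(entry)
                break
    return hits

if os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as mounts:
        points = nfs_mount_points(mounts.read())
    print(path_entries_on_nfs(os.environ.get("PATH", ""), points))
```

An empty list means no $PATH entry sits directly on an NFS mount; it does not rule out other routes to the share, such as bookmarks or desktop files.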
-----Original Message-----
From: Whoop Whouzer [mailto:tiredandnumb@xxxxxxxxx]
Sent: Saturday, January 23, 2010 8:28 AM
To: Peter Chacko
Cc: linux-nfs@xxxxxxxxxxxxxxx
Subject: Re: nfs client performance while server is down
I don't remember all the different set-ups I tried it on, but I just
confirmed this with the following combinations:
ubuntu server 10.04 (alpha 2) --> ubuntu desktop 9.10, ubuntu desktop
10.04 (alpha 2), fedora 12
ubuntu server 9.10 --> ubuntu desktop 9.10, ubuntu desktop 10.04
(alpha 2), fedora 12
I'll be happy to test it on another client machine (distro), or even
another server (although that would require a little more time).
Here are some examples of the bug reports I noticed, which do not
seem to get solved:
https://bugzilla.redhat.com/show_bug.cgi?id=175283
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/164120
regards,
Whoop
On Sat, Jan 23, 2010 at 4:57 PM, Peter Chacko
<peterchacko35@xxxxxxxxx> wrote:
On which client OS did you observe this behavior? This has nothing to
do with NFS design, which is purely stateless. It is up to the client
OS implementation to decide how to deal with local I/O when an NFS
share gets disconnected.
Maybe you found a VFS bug in the local OS.
thanks
On Sat, Jan 23, 2010 at 9:15 PM, Whoop Whouzer
<tiredandnumb@xxxxxxxxx> wrote:
Howdy,
I was wondering why nfs is designed in such a way that the performance
of an nfs client machine gets very bad when the nfs server is offline?
This is even the case with a soft mount (either via mount or fstab).
Just about every application that requires disk access (not talking
about nfs share access) gets really slow to unresponsive. For instance
nautilus becomes unresponsive when displaying the contents of any
folder on the local disk, playing movie files (stored on local disk)
makes totem or vlc get stuck at set intervals, even the terminal
becomes unresponsive at times.
I could understand that these problems would occur while accessing the
nfs share directory while the server is offline, but why for totally
unrelated directories?
I have experienced this behaviour on various distros, and also found
various bug reports on this issue; they don't seem to get solved as
this is viewed as nfs design.
I see this as a flaw because clients are totally dependent on the
server. This would be less of a deal if the entire home directory
were stored on nfs (although I even think some sort of
synchronisation technology could and should be implemented in this
case). It is a bit odd that (technically) one machine serving some
"useless" files to a non-trivial directory on client machines can take
down these client machines.
For me the preferred functionality would be:
* If an nfs server goes offline, the client's nfs share becomes
  inaccessible, but local directories and applications (that only
  require local disk access) stay responsive.
* If an nfs server comes back online (after being offline while the
  client has not been restarted), the nfs share is reconnected.
regards,
Whoop
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
<stracesdiff.log>