Re: nfs client performance while server is down

Hmm, now that I look closer, maybe there are no real differences at
all, just different address allocations...
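
One way to check whether the two traces differ only in addresses is to mask the per-run values before diffing. The following is a minimal sketch, not a finished tool: the function names are made up for illustration, and it assumes the traces were saved as text files (for example with `strace -f -o FILE nautilus`), so that each line may start with a bare PID column.

```python
import difflib
import re

# Values that change between runs without indicating different behaviour.
HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")
LEADING_PID = re.compile(r"^\d+\s+")  # PID column written by "strace -f -o FILE"


def normalize(line):
    """Drop a leading PID column and mask every hexadecimal value."""
    line = LEADING_PID.sub("", line.rstrip("\n"))
    return HEX_ADDRESS.sub("0xADDR", line)


def trace_diff(up_lines, down_lines):
    """Return unified-diff lines for the two traces after normalization."""
    up = [normalize(line) for line in up_lines]
    down = [normalize(line) for line in down_lines]
    return list(
        difflib.unified_diff(up, down, "server-up", "server-down", lineterm="")
    )
```

With the two trace files opened, `trace_diff(open(up_file), open(down_file))` returns an empty list when only addresses and PIDs differ. Note that the hex mask is deliberately crude: it also hides hexadecimal flag values, so a non-empty diff is meaningful but an empty one is only a hint.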

On Mon, Jan 25, 2010 at 8:38 PM, Whoop Whouzer <tiredandnumb@xxxxxxxxx> wrote:
> Running "strace nautilus" gives me a lot of output. When I run it
> while the server is down, the trace completes without a hiccup and
> returns, and then nautilus is launched and hangs.
> There are differences between the traces (with server up and server
> down), but I can't really see where the problem lies in there.
>
> On Mon, Jan 25, 2010 at 8:08 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>> On Jan 25, 2010, at 2:02 PM, Whoop Whouzer wrote:
>>>
>>> OK, I did that. After shutting down the server and enabling debug
>>> tracing, I tried to open the home folder of the current account
>>> (totally unrelated to the NFS share). It wouldn't open at all; no
>>> nautilus window appeared. While my cursor was in busy mode I got the
>>> following messages in kern.log (for the Ubuntu 10.04 client):
>>> Jan 25 19:30:13 whoop-desktop kernel: [  160.719262] NFS call  fsstat
>>> Jan 25 19:30:37 whoop-desktop kernel: [  184.458611] NFS: permission(0:16/74386), mask=0x10, res=0
>>> Jan 25 19:30:37 whoop-desktop kernel: [  184.458647] NFS call  access
>>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721086] nfs: server 192.168.1.130 not responding, timed out
>>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721113] NFS reply statfs: -5
>>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721116] nfs_statfs: statfs error = 5
>>> This series of traces repeats over and over at a set interval
>>> (the logs are not flooded), even if I do nothing.
>>> It's even worse than I thought: when I tried to shut down, the
>>> machine wouldn't, because it claimed the "File manager" was still
>>> running (although it was not visible on screen), so I had to kill
>>> that before I could shut down properly.
>>>
>>> In Fedora 12 I had a similar user experience (nautilus did show up
>>> without showing any contents and it was hanging). I had enabled
>>> tracing and it seems to be logged to /var/log/messages. I got this
>>> output in fedora:
>>> Jan 25 20:48:38 localhost kernel: NFS reply statfs: -5
>>> Jan 25 20:48:38 localhost kernel: nfs_statfs: statfs error = 5
>>> Jan 25 20:48:38 localhost kernel: NFS call  fsstat
>>> Jan 25 20:49:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out
>>> Jan 25 20:49:14 localhost kernel: NFS reply getattr: -5
>>> Jan 25 20:49:14 localhost kernel: nfs_revalidate_inode: (0:14/74386) getattr failed, error=-5
>>> Jan 25 20:49:25 localhost kernel: NFS: revalidating (0:14/74386)
>>> Jan 25 20:49:25 localhost kernel: NFS call  getattr
>>> Jan 25 20:50:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out
>>> Jan 25 20:50:14 localhost kernel: NFS reply access: -5
>>> Jan 25 20:50:14 localhost kernel: NFS: permission(0:14/74386), mask=0x1, res=-5
>>> Jan 25 20:50:14 localhost kernel: NFS call  access
>>> Jan 25 20:51:14 localhost kernel: nfs: server 192.168.1.130 not responding, timed out
>>> Jan 25 20:51:14 localhost kernel: NFS reply statfs: -5
>>> Jan 25 20:51:14 localhost kernel: nfs_statfs: statfs error = 5
>>> Jan 25 20:51:14 localhost kernel: NFS call  fsstat
>>> Most of the trace repeats at set intervals as well; there is no
>>> flooding of the logs...
>>> Fedora would not shut down normally either.
>>
>> This verifies that your client is attempting to access the NFS server, but
>> doesn't tell us which file it's attempting to access.  Essentially the EIO
>> means "failed to connect".
>>
>> Maybe try an strace of the nautilus process next?
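
To see at a glance which NFS operations keep failing in kernel logs like the ones quoted above, the debug messages can be tallied. This is a rough sketch under assumptions: the function name is invented, the two patterns are taken only from the message shapes visible in this thread ("NFS reply <op>: <status>" and "nfs: server <addr> not responding"), and each log record is expected on a single line, as in the log file itself.

```python
import re
from collections import Counter

# Message shapes copied from the log excerpts in this thread.
REPLY = re.compile(r"NFS reply (\w+): (-?\d+)")
TIMEOUT = re.compile(r"nfs: server (\S+) not responding")


def summarize(log_lines):
    """Count failed NFS replies per (operation, status) and timeouts per server."""
    failed = Counter()
    timeouts = Counter()
    for line in log_lines:
        reply = REPLY.search(line)
        if reply and int(reply.group(2)) < 0:
            failed[(reply.group(1), int(reply.group(2)))] += 1
        timeout = TIMEOUT.search(line)
        if timeout:
            timeouts[timeout.group(1)] += 1
    return failed, timeouts
```

Calling `summarize(open("/var/log/messages"))` (or kern.log on Ubuntu) gives two counters. A status of -5 corresponds to the EIO mentioned above. The summary still does not say which file triggered the call, which is why the strace is the next step.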
>>
>>> On Mon, Jan 25, 2010 at 5:48 PM, Chuck Lever <chuck.lever@xxxxxxxxxx>
>>> wrote:
>>>>
>>>> On Jan 24, 2010, at 7:09 PM, Whoop Whouzer wrote:
>>>>>
>>>>> I did some network traces and there is nothing strange happening as
>>>>> far as I can tell. I shut down the server (some network traffic
>>>>> occurred as is to be expected). It got quiet again, I launched
>>>>> nautilus, it got stuck without displaying anything and there was no
>>>>> real network activity except 3 broadcasts using the ARP protocol
>>>>> asking where the server was (could be just coincidence).
>>>>
>>>> That sounds like the client does want to reconnect with the server.
>>>>
>>>> You could try enabling debug tracing on your client
>>>> (sudo rpcdebug -m nfs -s all) after shutting down your server, then
>>>> try to start nautilus.  The kernel log would then contain NFS-related
>>>> messages that might indicate where to look next.
>>>>
>>>>> Closing nautilus and launching it again makes it hang again, but I
>>>>> see no additional network traffic. After a while nautilus will
>>>>> display the contents of the folder without any network traffic.
>>>>>
>>>>> On Sun, Jan 24, 2010 at 10:34 PM, Muntz, Daniel <Dan.Muntz@xxxxxxxxxx>
>>>>> wrote:
>>>>>>
>>>>>> Perhaps something in your $PATH is in the NFS mount?  Do a network
>>>>>> trace and maybe you can see if, in fact, there are actually NFS
>>>>>> operations being attempted that you weren't expecting.  Then try to
>>>>>> figure out why.
>>>>>>
>>>>>>  -Dan
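
The $PATH question can also be checked without touching the (possibly dead) mount. The sketch below is only an illustration with made-up function names: it parses text in /proc/mounts format and compares PATH entries against NFS mount points as plain strings. It deliberately does not resolve symlinks, because doing so could itself block on an unreachable server, so an entry that reaches the share through a symlink would be missed.

```python
import posixpath

NFS_TYPES = {"nfs", "nfs4"}


def nfs_mount_points(mounts_text):
    """Return mount points of NFS filesystems from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            points.append(fields[1])
    return points


def path_entries_on_nfs(path_value, mount_points):
    """Return (entry, mount point) pairs for PATH entries inside an NFS mount."""
    hits = []
    for entry in path_value.split(":"):
        if not entry:
            continue
        entry = posixpath.normpath(entry)
        for mount_point in mount_points:
            root = posixpath.normpath(mount_point).rstrip("/") + "/"
            if entry + "/" == root or entry.startswith(root):
                hits.append((entry, mount_point))
    return hits
```

On a client this would be called as `path_entries_on_nfs(os.environ.get("PATH", ""), nfs_mount_points(open("/proc/mounts").read()))` after `import os`. An empty result suggests that the unexpected NFS operations come from somewhere other than $PATH lookups.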
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Whoop Whouzer [mailto:tiredandnumb@xxxxxxxxx]
>>>>>>> Sent: Saturday, January 23, 2010 8:28 AM
>>>>>>> To: Peter Chacko
>>>>>>> Cc: linux-nfs@xxxxxxxxxxxxxxx
>>>>>>> Subject: Re: nfs client performance while server is down
>>>>>>>
>>>>>>> I don't remember all the different set-ups I tried it on, but I just
>>>>>>> confirmed this with the following combinations:
>>>>>>>
>>>>>>> ubuntu server 10.04 (alpha 2) --> ubuntu desktop 9.10, ubuntu desktop
>>>>>>> 10.04 (alpha 2), fedora 12
>>>>>>> ubuntu server 9.10 --> ubuntu desktop 9.10, ubuntu desktop 10.04
>>>>>>> (alpha 2), fedora 12
>>>>>>>
>>>>>>> I'll be happy to test it on another client machine (distro) or even
>>>>>>> another server (although that would require a little more time).
>>>>>>>
>>>>>>> Here are some examples of the bug reports I noticed, which do not
>>>>>>> seem to get solved:
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=175283
>>>>>>> https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/164120
>>>>>>>
>>>>>>> regards,
>>>>>>> Whoop
>>>>>>>
>>>>>>> On Sat, Jan 23, 2010 at 4:57 PM, Peter Chacko
>>>>>>> <peterchacko35@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> On which client OS did you observe this behavior?  This has nothing
>>>>>>>> to do with NFS design, and it's purely stateless. It's up to the
>>>>>>>> client OS implementation how to deal with aspects like local I/O
>>>>>>>> when an NFS share gets disconnected.
>>>>>>>>
>>>>>>>> It may be a VFS bug on the local OS where you found this problem.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>>
>>>>>>>> On Sat, Jan 23, 2010 at 9:15 PM, Whoop Whouzer
>>>>>>>> <tiredandnumb@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> Howdy,
>>>>>>>>>
>>>>>>>>> I was wondering why nfs is designed in such a way that the performance
>>>>>>>>> of an nfs client machine gets very bad when the nfs server is offline?
>>>>>>>>> This is even the case with a soft mount (either via mount or fstab).
>>>>>>>>> Just about every application that requires disk access (not talking
>>>>>>>>> about nfs share access) gets really slow to unresponsive. For instance
>>>>>>>>> nautilus becomes unresponsive when displaying the contents of any
>>>>>>>>> folder on the local disk,
>>>>>>>>> playing movie files (stored on local disk) lets totem or vlc get stuck
>>>>>>>>> at set intervals, even the terminal becomes unresponsive at times.
>>>>>>>>>
>>>>>>>>> I could understand that these problems would occur while accessing the
>>>>>>>>> nfs share directory while the server is offline, but why for totally
>>>>>>>>> unrelated directories?
>>>>>>>>>
>>>>>>>>> I have experienced this behaviour on various distros, and also found
>>>>>>>>> various bug reports on this issue; they don't seem to get solved, as
>>>>>>>>> this is viewed as nfs design.
>>>>>>>>> I see this as a flaw because clients are totally dependent on the
>>>>>>>>> server. This would be less of a deal if the entire home directory
>>>>>>>>> were stored on nfs (although I even think some sort of
>>>>>>>>> synchronisation technology could and should be implemented in this
>>>>>>>>> case). It is a bit odd that (technically) one machine serving some
>>>>>>>>> "useless" files to a non-trivial directory on client machines can take
>>>>>>>>> down these client machines.
>>>>>>>>>
>>>>>>>>> For me the preferred functionality would be:
>>>>>>>>> *If an nfs server goes offline, the client's nfs share becomes
>>>>>>>>> inaccessible, but local directories and applications (that only
>>>>>>>>> require local disk access) stay responsive.
>>>>>>>>> *If an nfs server comes back online (after being offline while the
>>>>>>>>> client has not been restarted), the nfs share gets reconnected.
>>>>>>>>>
>>>>>>>>> regards,
>>>>>>>>> Whoop
>>>>>>>>> --
>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>> chuck[dot]lever[at]oracle[dot]com
>>>>
>>
>
