Re: Unexplained NFS mount hangs

On Monday 13-04-2009 at 21:25 [timezone +0200], Rudy Zijlstra wrote:
> On Monday 13-04-2009 at 13:08 [timezone -0400], Chuck Lever wrote:
> > On Apr 13, 2009, at 12:47 PM, Daniel Stickney wrote:
> > 
> > > On Mon, 13 Apr 2009 12:12:47 -0400
> > > Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> > >
> > >> On Apr 13, 2009, at 11:24 AM, Daniel Stickney wrote:
> > >>> Hi all,
> > >>>
> > >>> I am investigating some NFS mount hangs that we have started to see
> > >>> over the past month on some of our servers. The behavior is that the
> > >>> client mount hangs and needs to be manually unmounted (forcefully
> > >>> with 'umount -f') and remounted to make it work. There are about 85
> > >>> clients mounting a partition over NFS. About 50 of the clients are
> > >>> running Fedora Core 3 with kernel 2.6.11-1.27_FC3smp. Not one of
> > >>> these 50 has ever had this mount hang. The other 35 are CentOS 5.2
> > >>> with kernel 2.6.27 which was compiled from source. The mount hangs
> > >>> are inconsistent and so far I don't know how to trigger them on
> > >>> demand. The timing of the hangs as noted by the timestamp in /var/
> > >>> log/messages varies. Not all of the 35 CentOS clients have their
> > >>> mounts hang at the same time, and the NFS server continues operating
> > >>> apparently normally for all other clients. Normally maybe 5 clients
> > >>> have a mount hang per week, on different days, mostly different
> > >>> times. Now and then we might see a cluster of a few clients
> > >>> have their mounts hang at the same exact time, but this is not
> > >>> consistent. In /var/log/messages we see


> OK, I'll switch to 2.6.30 on all clients once it is out. I prefer to wait
> for the release, as they are production machines.
> 
> If I get a hang, I'll check with "netstat --ip".
> 

Just now one of my 2.6.28.7 machines is hanging.
On the client, netstat shows:
tcp  0  0 mythm.romunt.nl:1020    repeater.romunt.nl:nfsd FIN_WAIT2
tcp 76  0 mythm.romunt.nl:6544    repeater.romunt.n:53854 ESTABLISHED

 
and on the server I find:
tcp  1  0 repeater.romunt.nl:nfsd mythm.romunt.nl:1020    CLOSE_WAIT 
tcp  0  0 repeater.romunt.n:53854 mythm.romunt.nl:6544    FIN_WAIT2  
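
For anyone comparing similar output: in TCP terms, FIN_WAIT2 means that side has sent its FIN and had it acknowledged, and is waiting for the peer's FIN; CLOSE_WAIT means that side has received the peer's FIN but the local socket has not been closed yet. So on the NFS connection (client port 1020) the client has closed its end and the server has not. A minimal sketch for picking such half-closed sockets out of saved netstat output follows. It assumes the six-column layout shown above, and the names Conn, parse_netstat and half_closed are made up for this illustration; they are not part of any NFS or netstat tooling.

```python
from typing import List, NamedTuple


class Conn(NamedTuple):
    """One TCP line of netstat output (hypothetical helper type)."""
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str


def parse_netstat(text: str) -> List[Conn]:
    """Parse lines of the form: tcp RECV-Q SEND-Q LOCAL REMOTE STATE."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a 6-column tcp line.
        if len(fields) == 6 and fields[0] == "tcp":
            conns.append(Conn(int(fields[1]), int(fields[2]),
                              fields[3], fields[4], fields[5]))
    return conns


def half_closed(conns: List[Conn]) -> List[Conn]:
    """Return connections where only one side has closed."""
    return [c for c in conns if c.state in ("FIN_WAIT2", "CLOSE_WAIT")]


# The client and server output quoted in this message.
CLIENT = """\
tcp  0  0 mythm.romunt.nl:1020    repeater.romunt.nl:nfsd FIN_WAIT2
tcp 76  0 mythm.romunt.nl:6544    repeater.romunt.n:53854 ESTABLISHED
"""
SERVER = """\
tcp  1  0 repeater.romunt.nl:nfsd mythm.romunt.nl:1020    CLOSE_WAIT
tcp  0  0 repeater.romunt.n:53854 mythm.romunt.nl:6544    FIN_WAIT2
"""

if __name__ == "__main__":
    for name, text in (("client", CLIENT), ("server", SERVER)):
        for c in half_closed(parse_netstat(text)):
            print(name, c.local, "->", c.remote, c.state)
```

Matching on the state column only is deliberate: netstat truncates long host names (repeater.romunt.n above), so pairing client and server lines by address would need numeric output (netstat -n) to be reliable.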


Cheers,

Rudy

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
