Re: nfsd4: utime sometimes takes 40+ seconds to return (but on SLES11SP3 with kernel 3.0.82)

On 10.09.2013 at 22:35, "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Tue, Sep 10, 2013 at 08:49:05PM +0200, Joschi Brauchle wrote:
>> Hello everyone,
>> 
>> we are administering an NFS high-availability cluster running on
>> SLES11SP1 with kernel 2.6.32.59. Just recently, one of the cluster
>> machines was updated to SLES11SP3 with kernel 3.0.82.
>> 
>> 
>> We are now experiencing severe hangs on NFS clients when the
>> SLES11SP3 server is running the NFS services. An strace of the
>> hanging processes on the client side shows that they are waiting
>> 60+ seconds for a "utime()" call to complete.
>> 
>> 
>> The problem we see matches the problem described in the thread
>> "v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to
>> return". If the NFS server is running on SLES11SP3, the little test
>> program provided in that thread hangs at the "utime()" call for 60+
>> seconds. It hangs each time it is run! It finishes right away, with
>> no delay, each time if SLES11SP1 is providing the NFS services.
>> 
>> 
>> Now, in the serverside logfiles of SLES11SP3 we see these messages
>> (not so on SP1):
>> --------------
>> kernel: [99381.184976] RPC: AUTH_GSS upcall timed out.
>> kernel: [99381.184978] Please check user daemon is running.
>> --------------
>> 
>> We have always been running the NFS server without rpc.gssd on the
>> server side, as the init script for the nfsserver also does not
>> start rpc.gssd.
>> 
>> 
>> Once we started rpc.gssd on the SLES11SP3 server, the test utility
>> on the client showed that the first call to "utime()" succeeds
>> right away, while the second call takes ~25s to complete. After
>> that, any subsequent runs of the utility finish without delay.
>> 
>> 
>> So can anyone confirm that with kernel 3.0+ the rpc.gssd daemon is
>> also required on the server side for correct operation?
>> 
>> Has there been a change between kernel 2.6.32.59 and 3.0.x?
>> 
>> Thus, does the nfsserver init script in SLES11SP3 indeed fail to
>> start rpc.gssd?
> 
> It should be starting rpc.gssd to allow callbacks, yes.
> 
> --b.

Ok, we will run rpc.gssd on the server. Thanks. 

Could you please comment on whether NFS clients hanging on utime() calls is to be expected when *not* running rpc.gssd? Or is this a problem that needs to be investigated?
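
For anyone who wants to reproduce the timing, here is a minimal sketch of a check in the spirit of the test program from the earlier thread. It is not that program: the function name time_utime, the two-call default, and the fallback to a local temporary file are assumptions made for illustration only. Pass the path of an existing file on the NFS mount as the first argument.

```python
import os
import sys
import tempfile
import time


def time_utime(path, rounds=2):
    """Call os.utime() on `path` `rounds` times; return each call's duration in seconds.

    Hypothetical helper for illustration, not the test program from the thread.
    """
    durations = []
    for _ in range(rounds):
        start = time.monotonic()
        os.utime(path, None)  # set atime and mtime to the current time
        durations.append(time.monotonic() - start)
    return durations


def main(argv):
    # Use the given file if it exists; otherwise fall back to a local
    # temporary file so the script still runs (assumption for the sketch).
    if len(argv) > 1 and os.path.exists(argv[1]):
        path, cleanup = argv[1], False
    else:
        fd, path = tempfile.mkstemp()
        os.close(fd)
        cleanup = True
    try:
        for i, seconds in enumerate(time_utime(path), start=1):
            print(f"utime() call {i}: {seconds:.3f} s")
    finally:
        if cleanup:
            os.remove(path)


if __name__ == "__main__":
    main(sys.argv)
```

Running it several times in a row against a file on the affected mount should make a pattern such as "first call fast, second call slow" visible, if it is present.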

Best regards,
J Brauchle

