On 10.09.2013 at 22:35, "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Tue, Sep 10, 2013 at 08:49:05PM +0200, Joschi Brauchle wrote:
>> Hello everyone,
>>
>> we are administering an NFS high-availability cluster running on
>> SLES11SP1 with kernel 2.6.32.59. Just recently, one of the cluster
>> machines was updated to SLES11SP3 with kernel 3.0.82.
>>
>> We are now experiencing severe hangs on NFS clients when the
>> SLES11SP3 server is running the NFS services. An strace on the
>> hanging processes on the client side shows that they are waiting up
>> to 60+ seconds for a "utime()" call to complete.
>>
>> The problem we see matches the problem described in the thread
>> "v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to
>> return". If the NFS server is running on SLES11SP3, the little test
>> program provided in that thread hangs at the "utime()" call for 60+
>> seconds. It hangs each time it is run! It finishes right away with 0
>> seconds delay if SLES11SP1 is providing NFS services, each time.
>>
>> Now, in the server-side logfiles of SLES11SP3 we see these messages
>> (not so on SP1):
>> --------------
>> kernel: [99381.184976] RPC: AUTH_GSS upcall timed out.
>> kernel: [99381.184978] Please check user daemon is running.
>> --------------
>>
>> We have always been running the NFS server without rpc.gssd on the
>> server side, as the init script for the nfsserver also does not
>> start rpc.gssd.
>>
>> Once we started rpc.gssd on the SLES11SP3 server, using the test
>> utility on the client shows that the first call to "utime()"
>> succeeds right away, the second call takes ~25s to complete. But
>> now, any subsequent runs of the utility finish with no more delay.
>>
>> So can anyone confirm that with kernel 3.0+ the rpc.gssd daemon is
>> also required on the server side for correct operation?
>>
>> Has there been a change between kernel 2.6.32.59 and 3.0.x?
>>
>> Thus, is the init script of the nfsserver in SLES11SP3 indeed
>> failing to start rpc.gssd?
>
> It should be starting rpc.gssd to allow callbacks, yes.
>
> --b.

Ok, we will run rpc.gssd on the server. Thanks.

Could you please comment on whether NFS clients hanging on utime() calls
is to be expected when rpc.gssd is *not* running on the server? Or is this
a problem that needs to be investigated?

Best regards,
J Brauchle
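
P.S. For anyone who wants to repeat the timing check described above, the following is a minimal sketch in Python. It is not the test program from the referenced thread; the function name time_utime_calls and the command-line handling are assumptions made for illustration. It only measures how long consecutive utime() calls take on a given file, so the file has to live on the NFS mount in question for the numbers to mean anything.

```python
import os
import sys
import tempfile
import time


def time_utime_calls(path, calls=2):
    """Return the wall-clock duration (seconds) of each utime() call on path."""
    durations = []
    for _ in range(calls):
        start = time.monotonic()
        # Passing None sets access and modification time to "now".
        os.utime(path, None)
        durations.append(time.monotonic() - start)
    return durations


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Hypothetical usage: pass a file on the NFS mount as the argument.
        target = sys.argv[1]
        cleanup = False
    else:
        # Fallback: a local temporary file, useful only as a smoke test.
        fd, target = tempfile.mkstemp()
        os.close(fd)
        cleanup = True

    for i, seconds in enumerate(time_utime_calls(target), start=1):
        print(f"utime() call {i}: {seconds:.3f} s")

    if cleanup:
        os.remove(target)
```

Run against a file on the affected mount, a delay of tens of seconds on one of the calls would correspond to the hang seen in strace; on a local file every call should return almost immediately.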