Hi guys,

I'm running the following setup on Ubuntu 14.04 for both the server and the clients:

== NFS server with /etc/exports:

    /var/www/ 172.16.1.254(rw,no_root_squash,sync,no_subtree_check) 172.16.1.184(rw,no_root_squash,sync,no_subtree_check) 172.16.0.120(rw,no_root_squash,sync,no_subtree_check) 172.16.0.193(rw,no_root_squash,sync,no_subtree_check)

Version: 1:1.2.8-6ubuntu1.2

== Four NFS clients with fstab:

    alpha:/var/www /var/www nfs4 nosharecache,fsc=example_web,noatime,tcp,bg,nosuid,rsize=32768,wsize=32768,soft,proto=tcp 0 0

On the clients I'm using cachefilesd:

    /var/cache/cachefilesd/loopimage.img /var/cache/cachefilesd/srv ext4 loop,rw,relatime,errors=continue,user_xattr,acl,barrier=1,data=ordered 0 0

    root@web1:~# cat /etc/cachefilesd.conf
    dir /var/cache/cachefilesd/srv
    tag nfs_filesystem_cache
    brun 20%
    frun 10%
    bcull 10%
    fcull 7%
    bstop 5%
    fstop 3%

== Problem

Both the server and the clients experience random kernel panics. Of the five machines, around one dies per day. They all run on Amazon AWS as m4.large instances.
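As a side note on the cachefilesd.conf quoted above: the cachefilesd documentation requires the culling limits to satisfy 0 <= bstop < bcull < brun < 100 and 0 <= fstop < fcull < frun < 100. Below is a minimal sketch of a check for that ordering. The helper names `parse_limits` and `check_limits` are made up for illustration, and the sketch assumes all six limits are set explicitly, as they are here.

```python
# Limits copied from the /etc/cachefilesd.conf shown in this post.
CONF = """\
dir /var/cache/cachefilesd/srv
tag nfs_filesystem_cache
brun 20%
frun 10%
bcull 10%
fcull 7%
bstop 5%
fstop 3%
"""


def parse_limits(conf_text):
    """Collect the percentage settings (brun, frun, ...) from the config text."""
    limits = {}
    for raw in conf_text.splitlines():
        parts = raw.split()
        # Only "name N%" lines are limits; dir/tag lines are skipped.
        if len(parts) == 2 and parts[1].endswith("%"):
            limits[parts[0]] = int(parts[1].rstrip("%"))
    return limits


def check_limits(limits):
    """Return a list of violated orderings; an empty list means consistent.

    Assumes all six limits are present (hypothetical simplification).
    """
    problems = []
    for prefix in ("b", "f"):
        stop = limits[prefix + "stop"]
        cull = limits[prefix + "cull"]
        run = limits[prefix + "run"]
        if not (0 <= stop < cull < run < 100):
            problems.append(
                f"{prefix}stop < {prefix}cull < {prefix}run violated: "
                f"{stop}, {cull}, {run}"
            )
    return problems


if __name__ == "__main__":
    print(check_limits(parse_limits(CONF)) or "limits are consistently ordered")
```

This only checks the ordering of the thresholds; it says nothing about whether the cache is related to the panics.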
When I set

    rpcdebug -m nfsd -s all
    rpcdebug -m rpc -s all

the messages before the crash (this time on the NFS server) are:

```
Nov 30 13:49:54 nfs-master kernel: [38232.649545] nfsd_dispatch: vers 4 proc 1
Nov 30 13:49:54 nfs-master kernel: [38232.649547] nfsv4 compound op #1/3: 22 (OP_PUTFH)
Nov 30 13:49:54 nfs-master kernel: [38232.649548] nfsd: fh_verify(32: 81060001 0c7791ab ab46dd87 663ae28a 6877949f 2802898e)
Nov 30 13:49:54 nfs-master kernel: [38232.649552] nfsv4 compound op ffff8802026c8080 opcnt 3 #1: 22: status 0
Nov 30 13:49:54 nfs-master kernel: [38232.649553] nfsv4 compound op #2/3: 4 (OP_CLOSE)
Nov 30 13:49:54 nfs-master kernel: [38232.649554] NFSD: nfsd4_close on file objectLinksShadow.png
Nov 30 13:49:54 nfs-master kernel: [38232.649556] NFSD: nfs4_preprocess_seqid_op: seqid=818421 stateid = (565bb0a0/00000001/00083f05/00000001)
Nov 30 13:49:54 nfs-master kernel: [38232.649557] renewing client (clientid 565bb0a0/00000001)
Nov 30 13:49:54 nfs-master kernel: [38232.649558] NFSD: move_to_close_lru nfs4_openowner ffff8800373b8000
Nov 30 13:49:54 nfs-master kernel: [38232.649559] nfsv4 compound op ffff8802026c8080 opcnt 3 #2: 4: status 0
Nov 30 13:49:54 nfs-master kernel: [38232.649560] nfsv4 compound op #3/3: 9 (OP_GETATTR)
Nov 30 13:49:54 nfs-master kernel: [38232.649562] nfsd: fh_verify(32: 81060001 0c7791ab ab46dd87 663ae28a 6877949f 2802898e)
Nov 30 13:49:54 nfs-master kernel: [38232.649564] nfsv4 compound op ffff8802026c8080 opcnt 3 #3: 9: status 0
Nov 30 13:49:54 nfs-master kernel: [38232.649565] nfsv4 compound returned 0
Nov 30 13:49:54 nfs-master kernel: [38232.649570] svc: socket ffff8800e929d000 sendto([ffff8801e07ae000 136... ], 136) = 136 (addr 172.16.0.120, port=958)
Nov 30 13:49:54 nfs-master kernel: [38232.649571] svc: server ffff880202142000 waiting for data (to = 900000)
Nov 30 13:49:54 nfs-master rsyslogd: [origin software="rsyslogd" swVersion="7.4.4" x-pid="939" x-info="http://www.rsyslog.com"] exiting on signal 15.
Server is rebooting here
Nov 30 13:50:34 nfs-master rsyslogd: [origin software="rsyslogd" swVersion="7.4.4" x-pid="951" x-info="http://www.rsyslog.com"] start
Nov 30 13:50:34 nfs-master rsyslogd-2307: warning: ~ action is deprecated, consider using the 'stop' statement instead [try http://www.rsyslog.com/e/2307 ]
Nov 30 13:50:34 nfs-master rsyslogd: rsyslogd's groupid changed to 104
Nov 30 13:50:34 nfs-master rsyslogd: rsyslogd's userid changed to 101
Nov 30 13:50:34 nfs-master kernel: [ 0.000000] Initializing cgroup subsys cpuset
Nov 30 13:50:34 nfs-master kernel: [ 0.000000] Initializing cgroup subsys cpu
Nov 30 13:50:34 nfs-master kernel: [ 0.000000] Initializing cgroup subsys cpuacct
```
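In case it is useful for anyone digging through the same kind of log, here is a rough sketch of how the last kernel messages before each reboot could be pulled out of syslog. It assumes a reboot shows up as the bracketed kernel uptime stamp dropping back towards zero, as in the excerpt above. The function name `lines_before_reboots` and its `context` parameter are made up for illustration, and `SAMPLE` is just a few lines copied from the log in this post.

```python
import re

# Matches the kernel uptime stamp in lines such as
# "Nov 30 13:49:54 nfs-master kernel: [38232.649571] svc: ..."
KERNEL_STAMP = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")

SAMPLE = [
    "Nov 30 13:49:54 nfs-master kernel: [38232.649565] nfsv4 compound returned 0",
    "Nov 30 13:49:54 nfs-master kernel: [38232.649571] svc: server "
    "ffff880202142000 waiting for data (to = 900000)",
    "Nov 30 13:49:54 nfs-master rsyslogd: exiting on signal 15.",
    "Nov 30 13:50:34 nfs-master kernel: [ 0.000000] Initializing cgroup subsys cpuset",
]


def lines_before_reboots(lines, context=5):
    """Return one list per detected reboot, holding the last `context`
    kernel lines logged before the uptime stamp went backwards."""
    windows = []
    recent = []        # kernel lines seen since the previous reboot
    last_stamp = None
    for line in lines:
        match = KERNEL_STAMP.search(line)
        if match is None:
            continue   # not a kernel message (e.g. rsyslogd), skip it
        stamp = float(match.group(1))
        if last_stamp is not None and stamp < last_stamp:
            # Uptime went backwards, so the machine rebooted in between.
            windows.append(recent[-context:] if context > 0 else [])
            recent = []
        recent.append(line.rstrip("\n"))
        last_stamp = stamp
    return windows


if __name__ == "__main__":
    for window in lines_before_reboots(SAMPLE, context=2):
        print("\n".join(window))
        print("--- reboot ---")
```

On a real machine the input would be the lines of the syslog file rather than `SAMPLE`. Only lines that reached the log are visible this way, so messages printed during the panic itself may be missing from it.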