Re: NFS client stuck in a tight loop

On Sep 7, 2016, at 10:35 AM, Trond Myklebust wrote:

> Hi Oleg,
> 
> 
>> On Sep 6, 2016, at 18:42, Oleg Drokin <green@xxxxxxxxxxxxxx> wrote:
>> 
>> Hello!
>> 
>> I saw this in 4.6-rc and it's still there in 4.8-rc5; judging by the commit history
>> it might have been there all the way since 2008 (or something since then made it
>> possible).
>> 
>> Basically the NFS client gets stuck in nfs4_reclaim_open_state(), looping in:
>> 
>> restart:
>> 	list_for_each_entry(state, &sp->so_states, open_states) {
>> …
>> 		status = ops->recover_open(sp, state);
>> 		if (status >= 0) {   // <=== Status is always 0 here in the loop
>> …
>> 			goto restart;
>> 		}
>> 	}
>> 
>> The -1 NFS debug output looks like this, looping over and over (note the
>> two different fh_crc values):
>> 
>> [10666.109326] NFS: nfs_update_inode(0:48/184 fh_crc=0x86636bbd ct=4 info=0x427e7f)
>> [10666.109327] NFS: nfs_fhget(0:48/184 fh_crc=0x86636bbd ct=4)
>> [10666.109329] nfs4_free_slot: slotid 0 highest_used_slotid 4294967295
>> [10666.109346] --> nfs4_setup_sequence clp ffff880098c86800 session ffff8800d6473800 sr_slot 4294967295
>> [10666.109347] --> nfs41_setup_sequence
>> [10666.109347] --> nfs4_alloc_slot used_slots=0000 highest_used=4294967295 max_slots=31
>> [10666.109348] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 slotid=0
>> [10666.109348] <-- nfs41_setup_sequence slotid=0 seqid=40242184
>> [10666.109349] <-- nfs4_setup_sequence status=0
>> [10666.109357] encode_sequence: sessionid=1473188904:1:2:0 seqid=40242184 slotid=0 max_slotid=0 cache_this=1
>> [10666.109495] decode_attr_type: type=0100000
>> [10666.109495] decode_attr_change: change attribute=6327302647620605281
>> [10666.109496] decode_attr_size: file size=32592
>> [10666.109496] decode_attr_fsid: fsid=(0x0/0x0)
>> [10666.109497] decode_attr_fileid: fileid=33
>> [10666.109497] decode_attr_fs_locations: fs_locations done, error = 0
>> [10666.109497] decode_attr_mode: file mode=01752
>> [10666.109498] decode_attr_nlink: nlink=0
>> [10666.109499] decode_attr_owner: uid=-2
>> [10666.109499] decode_attr_group: gid=-2
>> [10666.109500] decode_attr_rdev: rdev=(0x0:0x0)
>> [10666.109500] decode_attr_space_used: space used=36864
>> [10666.109500] decode_attr_time_access: atime=1473189948
>> [10666.109501] decode_attr_time_metadata: ctime=1473189948
>> [10666.109501] decode_attr_time_modify: mtime=1473189948
>> [10666.109501] decode_attr_mounted_on_fileid: fileid=33
>> [10666.109502] decode_getfattr_attrs: xdr returned 0
>> [10666.109502] decode_getfattr_generic: xdr returned 0
>> [10666.109503] nfs41_handle_sequence_flag_errors: "localhost" (client ID 2814cf5701000000) flags=0x00000040
>> [10666.109514] nfs41_handle_some_state_revoked: state revoked on server localhost
>> [10666.109514] nfs41_handle_recallable_state_revoked: Recallable state revoked on server localhost!
>> [10666.109515] --> nfs4_alloc_slot used_slots=0001 highest_used=0 max_slots=31
>> [10666.109516] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 slotid=1
>> [10666.109516] nfs4_free_slot: slotid 1 highest_used_slotid 0
>> [10666.109517] nfs41_sequence_process: Error 0 free the slot 
>> [10666.109547] NFS: nfs_update_inode(0:48/33 fh_crc=0x7318ec88 ct=3 info=0x427e7f)
>> [10666.109548] NFS: nfs_fhget(0:48/33 fh_crc=0x7318ec88 ct=3)
>> [10666.109551] nfs4_free_slot: slotid 0 highest_used_slotid 4294967295
>> [10666.109568] --> nfs4_setup_sequence clp ffff880098c86800 session ffff8800d6473800 sr_slot 4294967295
>> [10666.109568] --> nfs41_setup_sequence
>> [10666.109569] --> nfs4_alloc_slot used_slots=0000 highest_used=4294967295 max_slots=31
>> [10666.109569] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 slotid=0
>> [10666.109570] <-- nfs41_setup_sequence slotid=0 seqid=40242185
>> [10666.109570] <-- nfs4_setup_sequence status=0
>> [10666.109577] encode_sequence: sessionid=1473188904:1:2:0 seqid=40242185 slotid=0 max_slotid=0 cache_this=1
>> [10666.109710] decode_attr_type: type=0100000
>> [10666.109710] decode_attr_change: change attribute=6327303042766923666
>> [10666.109711] decode_attr_size: file size=4096
>> [10666.109711] decode_attr_fsid: fsid=(0x0/0x0)
>> [10666.109711] decode_attr_fileid: fileid=184
>> [10666.109712] decode_attr_fs_locations: fs_locations done, error = 0
>> [10666.109712] decode_attr_mode: file mode=03461
>> [10666.109712] decode_attr_nlink: nlink=0
>> [10666.109713] decode_attr_owner: uid=-2
>> [10666.109714] decode_attr_group: gid=-2
>> [10666.109714] decode_attr_rdev: rdev=(0x0:0x0)
>> [10666.109715] decode_attr_space_used: space used=4096
>> [10666.109715] decode_attr_time_access: atime=1473189947
>> [10666.109715] decode_attr_time_metadata: ctime=1473190040
>> [10666.109716] decode_attr_time_modify: mtime=1473190040
>> [10666.109716] decode_attr_mounted_on_fileid: fileid=184
>> [10666.109717] decode_getfattr_attrs: xdr returned 0
>> [10666.109717] decode_getfattr_generic: xdr returned 0
>> [10666.109718] nfs41_handle_sequence_flag_errors: "localhost" (client ID 2814cf5701000000) flags=0x00000040
>> [10666.109729] nfs41_handle_some_state_revoked: state revoked on server localhost
>> [10666.109729] nfs41_handle_recallable_state_revoked: Recallable state revoked on server localhost!
>> [10666.109730] --> nfs4_alloc_slot used_slots=0001 highest_used=0 max_slots=31
>> [10666.109730] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 slotid=1
>> [10666.109731] nfs4_free_slot: slotid 1 highest_used_slotid 0
>> [10666.109732] nfs41_sequence_process: Error 0 free the slot 
>> 
>> 
>> Server-side messages at the -1 debug level look like this (from another instance):
>> 
>> [154944.451610] renewing client (clientid 57ce4a54/00000031)
>> [154944.451740] nfsd_dispatch: vers 4 proc 1
>> [154944.451741] nfsv4 compound op #1/5: 53 (OP_SEQUENCE)
>> [154944.451743] __find_in_sessionid_hashtbl: 1473137236:49:50:0
>> [154944.451744] nfsd4_sequence: slotid 0
>> [154944.451745] check_slot_seqid enter. seqid 355302832 slot_seqid 355302831
>> [154944.451747] nfsv4 compound op ffff8800b0664080 opcnt 5 #1: 53: status 0
>> [154944.451748] nfsv4 compound op #2/5: 22 (OP_PUTFH)
>> [154944.451749] nfsd: fh_verify(16: 01010001 00000000 00000324 30202305 00000000 00000000)
>> [154944.451757] nfsv4 compound op ffff8800b0664080 opcnt 5 #2: 22: status 0
>> [154944.451758] nfsv4 compound op #3/5: 18 (OP_OPEN)
>> [154944.451759] NFSD: nfsd4_open filename  op_openowner           (null)
>> [154944.451768] nfsd: fh_verify(16: 01010001 00000000 00000324 30202305 00000000 00000000)
>> [154944.451773] nfsd4_process_open2: stateid=(57ce4a54/00000031/00000247/0a96b4a5)
>> [154944.451788] nfsv4 compound op ffff8800b0664080 opcnt 5 #3: 18: status 0
>> [154944.451788] nfsv4 compound op #4/5: 3 (OP_ACCESS)
>> [154944.451790] nfsd: fh_verify(16: 01010001 00000000 00000324 30202305 00000000 00000000)
>> [154944.451794] nfsv4 compound op ffff8800b0664080 opcnt 5 #4: 3: status 0
>> [154944.451795] nfsv4 compound op #5/5: 9 (OP_GETATTR)
>> [154944.451796] nfsd: fh_verify(16: 01010001 00000000 00000324 30202305 00000000 00000000)
>> [154944.451800] nfsv4 compound op ffff8800b0664080 opcnt 5 #5: 9: status 0
>> [154944.451801] nfsv4 compound returned 0
>> [154944.451802] --> nfsd4_store_cache_entry slot ffff8800a7629000
>> [154944.451803] renewing client (clientid 57ce4a54/00000031)
>> [154944.452019] nfsd_dispatch: vers 4 proc 1
>> [154944.452020] nfsv4 compound op #1/5: 53 (OP_SEQUENCE)
>> [154944.452022] __find_in_sessionid_hashtbl: 1473137236:49:50:0
>> [154944.452023] nfsd4_sequence: slotid 0
>> [154944.452024] check_slot_seqid enter. seqid 355302833 slot_seqid 355302832
>> [154944.452026] nfsv4 compound op ffff8800aa787080 opcnt 5 #1: 53: status 0
>> [154944.452026] nfsv4 compound op #2/5: 22 (OP_PUTFH)
>> [154944.452028] nfsd: fh_verify(16: 01010001 00000000 000002f4 302021dd 00000000 00000000)
>> [154944.452037] nfsv4 compound op ffff8800aa787080 opcnt 5 #2: 22: status 0
>> [154944.452037] nfsv4 compound op #3/5: 18 (OP_OPEN)
>> [154944.452038] NFSD: nfsd4_open filename  op_openowner           (null)
>> [154944.452054] nfsd: fh_verify(16: 01010001 00000000 000002f4 302021dd 00000000 00000000)
>> [154944.452059] nfsd4_process_open2: stateid=(57ce4a54/00000031/00000246/0a96b4a5)
>> [154944.452076] nfsv4 compound op ffff8800aa787080 opcnt 5 #3: 18: status 0
>> [154944.452076] nfsv4 compound op #4/5: 3 (OP_ACCESS)
>> [154944.452078] nfsd: fh_verify(16: 01010001 00000000 000002f4 302021dd 00000000 00000000)
>> [154944.452081] nfsv4 compound op ffff8800aa787080 opcnt 5 #4: 3: status 0
>> [154944.452082] nfsv4 compound op #5/5: 9 (OP_GETATTR)
>> [154944.452083] nfsd: fh_verify(16: 01010001 00000000 000002f4 302021dd 00000000 00000000)
>> [154944.452087] nfsv4 compound op ffff8800aa787080 opcnt 5 #5: 9: status 0
>> [154944.452088] nfsv4 compound returned 0
>> [154944.452088] --> nfsd4_store_cache_entry slot ffff8800a7629000
>> 
>> 
>> This is fairly easy to reproduce (though it takes some time).
>> Once the client enters this state, some kernel threads are stuck at 100% CPU,
>> there's a bunch of processes in D state, and the mountpoint is pinned.
> 
> Can you explain how you reproduce the problem? I can see what is happening, but I’m still not understanding why the server is getting into this state in the first place.

Basically it's the same old racer test as before.
http://git.whamcloud.com/fs/lustre-release.git/tree/refs/heads/master:/lustre/tests/racer

It runs a bunch of conflicting operations in a very limited namespace, so
lots of processes step on each other's toes all the time; it makes for a nice stress test.

I have an ext4 filesystem mounted that I export via NFS.
Then I mount it locally (this probably works with remote mounts too, but they
are less convenient to test) twice, once as -t nfs and once as -t nfs4.
Then I run 3 instances of the racer test, one per mountpoint
(so two NFS mountpoints and one ext4 mountpoint), each with a duration of 1 hour.
At the end of the racer runs I try to unmount everything, and pretty often
the nfs4 unmount fails and the node ends up in the state described above.

The actual script (geared towards my nfsroot environment, so it has extra fluff
to start daemons and such that you can ignore):
[root@fedora1 ~]# cat ~/stress.sh 
#!/bin/bash

set -x

# Replace below with actual path to racer tree
cd /home/green/git/lustre-release/lustre/tests/racer
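# create a 1GB loopback image and format it as ext4; this is the fs we export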
dd if=/dev/zero of=/tmp/loop bs=1024k count=1024
mkfs.ext4 /tmp/loop
service rpcbind start
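# nfsroot fluff: give mountd a writable /var/lib/nfs with an empty etab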
mount none /var/lib/nfs -t tmpfs
touch /var/lib/nfs/etab
service nfs-mountd start

while :; do

date

mount /tmp/loop /mnt/lustre -o loop || exit 1
mkdir /mnt/lustre/racer
service nfs-server start || exit 2
mount localhost:/mnt/lustre /mnt/nfs -t nfs -o nolock || exit 3
mount localhost:/ /mnt/nfs2 -t nfs4 || exit 4
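# one 1-hour racer instance per mountpoint: nfs, nfs4 and the local ext4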
DURATION=3600 sh racer.sh /mnt/nfs/racer &
DURATION=3600 sh racer.sh /mnt/nfs2/racer &
DURATION=3600 sh racer.sh /mnt/lustre/racer &
wait %1 %2 %3
umount /mnt/nfs || exit 5
umount /mnt/nfs2 || exit 6
service nfs-server stop || exit 7
umount /mnt/lustre || exit 8
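# read-only forced fsck of the backing image to catch any on-disk corruption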
e2fsck -n -f /tmp/loop || exit 9
done
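
(Each pass of the loop remounts the same backing image, starts nfsd, mounts
both NFS flavors, runs the three racers for an hour, then tears everything
down and fscks the image.)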

The /etc/exports looks like this:
$ cat /etc/exports
/mnt/lustre *(rw,no_root_squash,fsid=0)
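
For reference, the "-1" debug level in the logs quoted above just means all
debug flags enabled; I turn it on with something along these lines (from
memory; rpcdebug -m nfs/nfsd -s all should be equivalent):

echo -1 > /proc/sys/sunrpc/nfs_debug      # client-side messages
echo -1 > /proc/sys/sunrpc/nfsd_debug     # server-side messages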

>> You can do umount -f, and then the unmount does work on the second try;
>> this also results in the following dmesg message:
>> [155381.995368] NFS: nfs4_reclaim_open_state: unhandled error -5
>> 
>> I don't know this code well enough to take my own stab at a fix, but I can
>> help testing if needed.
> 
> I did send out a set of patches a week ago that addresses an issue in this area, and I know of a couple more holes that need to be plugged.

Do you want me to try the tip of your trond/testing branch to see if it's any better?



