Maybe this requires some attention. I have a default CentOS 7 setup (maybe not the most recent kernel though) running Ceph Luminous, i.e. no unusual kernels. This is the second or third time that a VM has gone into a very high load (151) and its services have stopped. I have two VMs that both mount the same two CephFS 'shares'. After the last incident I unmounted the shares on the second server (we are migrating to a new environment, so this second server is not doing anything). Last time I thought this might be related to my work on switching from the stupid allocator to the bitmap allocator. Anyway, yesterday I thought I would mount the two shares on the second server again and see what happens, and this morning the high load was back. As far as I know the second server only runs a cron job on the CephFS mounts that creates snapshots.

1) I still have increased load on the OSD nodes, coming from CephFS. How can I see which client is doing this? I don't seem to get this from 'ceph daemon mds.c session ls', but 'ceph osd pool stats | grep client -B 1' indicates it is CephFS.
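What I could still try here, as a rough sketch (mds.c is the MDS name used elsewhere in this mail; osd.25 and the 'osd-node' prompt are only examples, and each 'ceph daemon' command has to be run on the host of that daemon), is to compare the client ids in the in-flight op dumps with the sessions listed by the MDS:

[@c03 ~]# ceph daemon mds.c dump_ops_in_flight        # MDS requests in progress, each tagged with a client.<id>
[@c03 ~]# ceph daemon mds.c session ls                # 'inst' maps a client.<id> to its IP address
[@osd-node ~]# ceph daemon osd.25 dump_ops_in_flight  # ops currently in flight on a busy OSD
[@osd-node ~]# ceph daemon osd.25 dump_historic_ops   # recently completed (slow) ops, also tagged with client.<id>

Matching the client.<id>/IP from those dumps against the two VMs should show which mount is generating the traffic, assuming the client shows up in 'session ls' at all.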
2) 'ceph osd blacklist ls' reports: No blacklist entries

3) The first server keeps generating messages like the following, while there is no issue with connectivity:

[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: osd25 192.168.10.114:6804 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: osd18 192.168.10.112:6802 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: osd22 192.168.10.111:6811 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session established

PS: 'dmesg -T' gives me strange times; as you can see they are in the future. The OS time is 2 minutes behind that, and the OS time is the correct one (ntpd is in sync).

[@ ]# uptime
 10:39:17 up 50 days, 13:31,  2 users,  load average: 3.60, 3.02, 2.57

4) Unmounting the filesystem on the first server fails.

5) Evicting the CephFS sessions of the first server does not change the CephFS load on the OSD nodes.

6) Unmounting all CephFS clients still leaves me with CephFS activity on the data pool and on the OSD nodes:

[@c03 ~]# ceph daemon mds.c session ls
[]

7) On the first server:

[@ ~]# ps -auxf| grep D
USER       PID %CPU %MEM    VSZ   RSS TTY   STAT START  TIME COMMAND
root      6716  3.0  0.0      0     0 ?     D    10:18  0:59  \_ [kworker/0:2]
root     20039  0.0  0.0 123520  1212 pts/0 D+   10:28  0:00  |   \_ umount /home/mail-archive/

[@ ~]# cat /proc/6716/stack
[<ffffffff8385e110>] __wait_on_freeing_inode+0xb0/0xf0
[<ffffffff8385e1e9>] find_inode+0x99/0xc0
[<ffffffff8385e281>] ilookup5_nowait+0x71/0x90
[<ffffffff8385f09f>] ilookup5+0xf/0x60
[<ffffffffc060fb35>] remove_session_caps+0xf5/0x1d0 [ceph]
[<ffffffffc06158fc>] dispatch+0x39c/0xb00 [ceph]
[<ffffffffc052afb4>] try_read+0x514/0x12c0 [libceph]
[<ffffffffc052bf64>] ceph_con_workfn+0xe4/0x1530 [libceph]
[<ffffffff836b9e3f>] process_one_work+0x17f/0x440
[<ffffffff836baed6>] worker_thread+0x126/0x3c0
[<ffffffff836c1d21>] kthread+0xd1/0xe0
[<ffffffff83d75c37>] ret_from_fork_nospec_end+0x0/0x39
[<ffffffffffffffff>] 0xffffffffffffffff

[@ ~]# cat /proc/20039/stack
[<ffffffff837b5e14>] __lock_page+0x74/0x90
[<ffffffff837c744c>] truncate_inode_pages_range+0x6cc/0x700
[<ffffffff837c74ef>] truncate_inode_pages_final+0x4f/0x60
[<ffffffff8385f02c>] evict+0x16c/0x180
[<ffffffff8385f87c>] iput+0xfc/0x190
[<ffffffff8385aa18>] shrink_dcache_for_umount_subtree+0x158/0x1e0
[<ffffffff8385c3bf>] shrink_dcache_for_umount+0x2f/0x60
[<ffffffff8384426f>] generic_shutdown_super+0x1f/0x100
[<ffffffff838446b2>] kill_anon_super+0x12/0x20
[<ffffffffc05ea130>] ceph_kill_sb+0x30/0x80 [ceph]
[<ffffffff83844a6e>] deactivate_locked_super+0x4e/0x70
[<ffffffff838451f6>] deactivate_super+0x46/0x60
[<ffffffff8386373f>] cleanup_mnt+0x3f/0x80
[<ffffffff838637d2>] __cleanup_mnt+0x12/0x20
[<ffffffff836be88b>] task_work_run+0xbb/0xe0
[<ffffffff8362bc65>] do_notify_resume+0xa5/0xc0
[<ffffffff83d76134>] int_signal+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff

What should I do now? In ceph.conf I only have these entries, and I am not sure whether I should keep them:

# 100k+ files in 2 folders
mds bal fragment size max = 120000
mds_session_blacklist_on_timeout = false
mds_session_blacklist_on_evict = false
mds_cache_memory_limit = 8000000000
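For what it is worth, one way to double-check what the running MDS actually uses for these settings before deciding whether to drop them (again just a sketch; mds.c is the daemon name used above and the commands have to be run on its host):

[@c03 ~]# ceph daemon mds.c config get mds_cache_memory_limit      # value the daemon is actually running with
[@c03 ~]# ceph daemon mds.c config show | grep -E 'mds_bal_fragment_size_max|mds_session_blacklist'   # the other entries listed above
[@c03 ~]# ceph daemon mds.c cache status                           # current cache memory usage, if this Luminous build supports it

Comparing the cache usage with mds_cache_memory_limit should at least show whether the 8 GB limit is being respected.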