Re: "session established", "io error", "session lost, hunting for new mon" solution/fix

Paul, this should have been (or already is) backported to this kernel, no?


-----Original Message-----
From: Paul Emmerich [mailto:paul.emmerich@xxxxxxxx] 
Cc: ceph-users
Subject: Re: "session established", "io error", "session lost, hunting for new mon" solution/fix

	 
	
	Anyone know why I would get these? Is it not strange to get them in a 'standard' setup?
	


You are probably running an ancient kernel. This bug was fixed a long time ago.


Paul
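
As a first check, the client's running kernel version is easy to confirm (the fix is in newer mainline kernels; whether a given distro kernel carries the backport depends on the distro):

uname -r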

	-----Original Message-----
	Subject: "session established", "io error", "session lost, hunting for new mon" solution/fix

	I have this on a cephfs client again (Luminous cluster, CentOS 7, only 32 osds!). Wanted to share the 'fix'.
	
	[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session established
	[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io error
	[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session lost, hunting for new mon
	[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session established
	[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 io error
	[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session lost, hunting for new mon
	[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session established
	[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io error
	[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session lost, hunting for new mon
	[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 session established
	[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 io error
	[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 session lost, hunting for new mon
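
	These messages can be followed live on the client while testing, e.g. with:
	dmesg -w | grep libceph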
	
	1) I blocked client access to the monitors with:
	iptables -I INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT
	Resulting in:
	
	[Thu Jul 11 12:34:16 2019] libceph: mon1 192.168.10.112:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:18 2019] libceph: mon1 192.168.10.112:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:22 2019] libceph: mon1 192.168.10.112:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:26 2019] libceph: mon2 192.168.10.113:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:27 2019] libceph: mon2 192.168.10.113:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:28 2019] libceph: mon2 192.168.10.113:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:30 2019] libceph: mon1 192.168.10.112:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:30 2019] libceph: mon2 192.168.10.113:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:34 2019] libceph: mon2 192.168.10.113:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:42 2019] libceph: mon2 192.168.10.113:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:44 2019] libceph: mon0 192.168.10.111:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:45 2019] libceph: mon0 192.168.10.111:6789 socket closed (con state CONNECTING)
	[Thu Jul 11 12:34:46 2019] libceph: mon0 192.168.10.111:6789 socket closed (con state CONNECTING)
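
	The REJECT rule can be double-checked before waiting, e.g.:
	iptables -L INPUT -n --line-numbers | grep 6789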
	
	2) I applied the suggested change to osd_map_message_max, mentioned in earlier threads [0]:
	ceph tell osd.* injectargs '--osd_map_message_max=10'
	ceph tell mon.* injectargs '--osd_map_message_max=10'
	[@c01 ~]# ceph daemon osd.0 config show|grep message_max
	    "osd_map_message_max": "10",
	[@c01 ~]# ceph daemon mon.a config show|grep message_max
	    "osd_map_message_max": "10",

	[0]
	https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg54419.html
	http://tracker.ceph.com/issues/38040
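
	Note that injectargs changes do not survive a daemon restart; to keep the lower value across restarts it can also be set in ceph.conf, e.g.:
	[global]
	osd_map_message_max = 10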
	
	3) Allowed access to a monitor again with:
	iptables -D INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT

	Getting:
	[Thu Jul 11 12:39:26 2019] libceph: mon0 192.168.10.111:6789 session established
	[Thu Jul 11 12:39:26 2019] libceph: osd0 down
	[Thu Jul 11 12:39:26 2019] libceph: osd0 up
	
	Problem solved; the hung unmount in D state was released.
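
	The client's monitor and OSD session state can also be inspected through the kernel client's debugfs entries, assuming debugfs is mounted (the wildcard stands for the per-client fsid.id directory):
	cat /sys/kernel/debug/ceph/*/monc
	cat /sys/kernel/debug/ceph/*/osdc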
	
	I am not sure whether the prolonged disconnection from the monitors was the fix, or the osd_map_message_max=10, or both.
	
	
	
	
	


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


