Re: Client Timeout on Rados Gateway

On Mon, Oct 7, 2013 at 1:35 PM, Gruher, Joseph R
<joseph.r.gruher@xxxxxxxxx> wrote:
> Thanks for the reply.  This eventually resolved itself when I upgraded the client kernel from the Ubuntu Server 12.04.2 default to the 3.6.10 kernel.  Not sure if there is a good causal explanation there or if it might be a coincidence.  I did see the kernel recommendations in the docs but I had assumed those just applied to the Ceph machines and not clients - perhaps that is a bad assumption.

No, the kernel should not matter for clients. The only other place I
could find that error string was the result of a version mismatch
large enough to span an incompatible encoding change we hadn't
handled appropriately, so I was thinking that maybe your client was
installed from a very old repository. Glad to hear it seems to have
worked itself out!
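
If you want to rule that out, a quick check on the client (a sketch,
assuming a stock Ubuntu apt setup) is to see which repository the
ceph packages come from and what the install candidate is:

# show the configured ceph repository, if any (the path may vary)
cat /etc/apt/sources.list.d/ceph.list
# installed vs. candidate version, and the repo each comes from
apt-cache policy ceph-common

A large gap between the installed and candidate versions would point
to a stale repository.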
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

>  Anyway, it works now, so I guess the next steps are to try moving the client back to the public network, re-enable authentication, and see whether it works or whether I still have an issue there.
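>
> (For reference, re-enabling cephx should just mean flipping the
> commented-out auth lines in the ceph.conf shown below back to cephx
> and restarting the daemons; a sketch:
>
> [global]
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
>
> plus making sure the client still has
> /etc/ceph/ceph.client.admin.keyring in place.)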
>
> With regard to versions:
>
> ceph@cephtest06:/etc/ceph$ ceph-mon --version
> ceph version 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
>
> ceph@cephtest06:/etc/ceph$ uname -a
> Linux cephtest06 3.6.10-030610-generic #201212101650 SMP Mon Dec 10 21:51:40 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> ceph@cephclient01:~/cos$ rados --version
> ceph version 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
>
> ceph@cephclient01:~/cos$ uname -a
> Linux cephclient01 3.6.10-030610-generic #201212101650 SMP Mon Dec 10 21:51:40 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> Thanks,
> Joe
>
>>-----Original Message-----
>>From: Gregory Farnum [mailto:greg@xxxxxxxxxxx]
>>Sent: Monday, October 07, 2013 1:27 PM
>>To: Gruher, Joseph R
>>Cc: ceph-users@xxxxxxxxxxxxxx
>>Subject: Re: Client Timeout on Rados Gateway
>>
>>The ping tests you're running are connecting to a different interface
>>(10.23.37.175) than the monitor addresses you specify in the "mon_host"
>>option (10.0.0.2, 10.0.0.3, 10.0.0.4). The client needs to be able to
>>connect to the specified addresses; I'm guessing they're not routable
>>from outside that network?
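>>
>>A quick sanity check from the client (a sketch, assuming the default
>>monitor port of 6789) is to test reachability of the mon_host
>>addresses themselves rather than the gateway's name:
>>
>>ping 10.0.0.2
>>nc -zv 10.0.0.2 6789    # should report the port open if it's routable
>>
>>If those fail, it's a network/routing problem rather than a Ceph one.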
>>
>>The error you're getting once you put it inside the network is more
>>interesting. What version of the Ceph packages do you have installed
>>there, and what's installed on the monitors? (Run "ceph-mon --version"
>>on a monitor and "rados --version" on the client; each will print its
>>version.)
>>-Greg
>>Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>On Tue, Oct 1, 2013 at 12:45 PM, Gruher, Joseph R
>><joseph.r.gruher@xxxxxxxxx> wrote:
>>> Hello-
>>>
>>>
>>>
>>> I've set up a rados gateway but I'm having trouble accessing it from
>>> clients.  I can access it with the rados command line just fine from
>>> any system in my ceph deployment, including my monitors and OSDs, the
>>> gateway system, and even the admin system I used to run ceph-deploy.
>>> However, when I set up a client outside the ceph nodes I get a timeout
>>> error, as shown at the bottom of the output pasted below.  I've turned
>>> off authentication for the moment to simplify things.  Systems are
>>> able to resolve names and reach each other via ping.  Any thoughts on
>>> what could be the issue here or how to debug?
>>>
>>>
>>>
>>> The failure:
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ rados df
>>>
>>> 2013-10-01 19:57:07.488970 7fd381db0780 monclient(hunting): authenticate timed out after 30
>>>
>>> 2013-10-01 19:57:07.489174 7fd381db0780 librados: client.admin authentication error (110) Connection timed out
>>>
>>> couldn't connect to cluster! error -110
>>>
>>>
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ sudo rados df
>>>
>>> 2013-10-01 19:57:44.461273 7fb6712d5780 monclient(hunting): authenticate timed out after 30
>>>
>>> 2013-10-01 19:57:44.461440 7fb6712d5780 librados: client.admin authentication error (110) Connection timed out
>>>
>>> couldn't connect to cluster! error -110
>>>
>>> ceph@cephclient01:/etc/ceph$
>>>
>>>
>>>
>>>
>>>
>>> Some details from the client:
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ pwd
>>>
>>> /etc/ceph
>>>
>>>
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ ls
>>>
>>> ceph.client.admin.keyring  ceph.conf  keyring.radosgw.gateway
>>>
>>>
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ cat ceph.conf
>>>
>>> [global]
>>> fsid = a45e6e54-70ef-4470-91db-2152965deec5
>>> mon_initial_members = cephtest02, cephtest03, cephtest04
>>> mon_host = 10.0.0.2,10.0.0.3,10.0.0.4
>>> osd_journal_size = 1024
>>> filestore_xattr_use_omap = true
>>> auth_cluster_required = none #cephx
>>> auth_service_required = none #cephx
>>> auth_client_required = none #cephx
>>>
>>> [client.radosgw.gateway]
>>> host = cephtest06
>>> keyring = /etc/ceph/keyring.radosgw.gateway
>>> rgw_socket_path = /tmp/radosgw.sock
>>> log_file = /var/log/ceph/radosgw.log
>>>
>>>
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ ping cephtest06
>>>
>>> PING cephtest06.jf.intel.com (10.23.37.175) 56(84) bytes of data.
>>>
>>> 64 bytes from cephtest06.jf.intel.com (10.23.37.175): icmp_req=1 ttl=64 time=0.216 ms
>>>
>>> 64 bytes from cephtest06.jf.intel.com (10.23.37.175): icmp_req=2 ttl=64 time=0.209 ms
>>>
>>> ^C
>>>
>>> --- cephtest06.jf.intel.com ping statistics ---
>>>
>>> 2 packets transmitted, 2 received, 0% packet loss, time 999ms
>>>
>>> rtt min/avg/max/mdev = 0.209/0.212/0.216/0.015 ms
>>>
>>>
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ ping cephtest06.jf.intel.com
>>>
>>> PING cephtest06.jf.intel.com (10.23.37.175) 56(84) bytes of data.
>>>
>>> 64 bytes from cephtest06.jf.intel.com (10.23.37.175): icmp_req=1 ttl=64 time=0.223 ms
>>>
>>> 64 bytes from cephtest06.jf.intel.com (10.23.37.175): icmp_req=2 ttl=64 time=0.242 ms
>>>
>>> ^C
>>>
>>> --- cephtest06.jf.intel.com ping statistics ---
>>>
>>> 2 packets transmitted, 2 received, 0% packet loss, time 999ms
>>>
>>> rtt min/avg/max/mdev = 0.223/0.232/0.242/0.017 ms
>>>
>>>
>>>
>>>
>>>
>>> I did try putting the client on the 10.0.0.x network to see if that
>>> would affect behavior but that just seemed to introduce a new problem:
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ rados df
>>>
>>> 2013-10-01 21:37:29.439410 7f60d2a43700 failed to decode message of type 59 v1: buffer::end_of_buffer
>>>
>>> 2013-10-01 21:37:29.439583 7f60d4a47700 monclient: hunting for new mon
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ ceph -m 10.0.0.2 -s
>>>
>>> 2013-10-01 21:37:42.341480 7f61eacd5700 monclient: hunting for new mon
>>>
>>> 2013-10-01 21:37:45.341024 7f61eacd5700 monclient: hunting for new mon
>>>
>>> 2013-10-01 21:37:45.343274 7f61eacd5700 monclient: hunting for new mon
>>>
>>>
>>>
>>> ceph@cephclient01:/etc/ceph$ ceph health
>>>
>>> 2013-10-01 21:39:52.833560 mon <- [health]
>>>
>>> 2013-10-01 21:39:52.834671 mon.0 -> 'unparseable JSON health' (-22)
>>>
>>> ceph@cephclient01:/etc/ceph$
>>>
>>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



