It turned out to be an authentication problem. I recreated the keyring file and re-added the RGW to the cluster, as follows:
ceph-authtool --create-keyring ceph.client.radosgw.keyring
ceph-authtool ceph.client.radosgw.keyring -n client.radosgw.gateway --gen-key
ceph-authtool -n client.radosgw.gateway --cap osd 'allow rwx' --cap mon 'allow rwx' ceph.client.radosgw.keyring
ceph -k ceph.client.admin.keyring auth add client.radosgw.gateway -i ceph.client.radosgw.keyring
Now I can run RGW as follows (the -n option is required, otherwise we see all sorts of errors):
sudo radosgw -c ceph.conf -n client.radosgw.gateway -d
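To illustrate why the -n option matters, here is a minimal Python sketch. It assumes that options are resolved from [global], then the type section (e.g. [client]), then the section named after the user, and that the name falls back to client.admin when -n is omitted. The sample ceph.conf content and the keyring path are hypothetical; the function name effective_settings is only for illustration and is not part of Ceph.

```python
import configparser

# Hypothetical ceph.conf content; the keyring path is illustrative only.
SAMPLE_CONF = """
[global]
auth client required = cephx

[client.radosgw.gateway]
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /var/log/radosgw/ceph-client.radosgw.gateway
"""


def effective_settings(conf_text, name):
    """Merge [global], the type section (e.g. [client]) and the [name] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    merged = {}
    daemon_type = name.split(".", 1)[0]
    # Later sections override earlier ones.
    for section in ("global", daemon_type, name):
        if parser.has_section(section):
            merged.update(parser[section])
    return merged


gateway = effective_settings(SAMPLE_CONF, "client.radosgw.gateway")
admin = effective_settings(SAMPLE_CONF, "client.admin")  # assumed default name

print("keyring" in gateway)  # settings found under the gateway user
print("keyring" in admin)    # nothing gateway-specific under client.admin
```

Under these assumptions, a gateway started without -n never sees the keyring and log settings that live in the [client.radosgw.gateway] section.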
Since we usually want to start RGW with the init.d script, the configuration file has to be in the default expected location, “/etc/ceph”, in case we didn’t put it there from the very beginning. In my case I hadn’t, so I had to suffer a little, since this was my first time installing RGW and adding it to the cluster.
Now we can run it with either of the following:
sudo service radosgw start
sudo /etc/init.d/radosgw start
And everything should work.
Thanks Yehuda for your support ..
Beanos!
Hello Yehuda,
I can see one problem only in the above output: -1 Couldn't init storage provider (RADOS) .. please check the output, probably you can find something useful
Add the '-n client.radosgw.gateway' param when you're running the gateway; all your settings are under that user.
Yehuda
----- Original Message -----
From: "B L" <super.iterator@xxxxxxxxx>
To: "Yehuda Sadeh-Weinraub" <yehuda@xxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Saturday, February 14, 2015 2:56:54 PM
Subject: Re: Having problem to start Radosgw
Yehuda ..
In case you will need to know more about my system
Here is my full cluster configuration: https://gist.github.com/anonymous/fb4c314320d7df75569a
And, that’s my ceph cluster status:
$ ceph -s
cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 203 pgs degraded; 203 pgs stuck unclean; recovery 6/151 objects degraded (3.974%)
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, quorum 0 ceph-node1
 osdmap e93: 6 osds: 6 up, 6 in
 pgmap v3676: 1920 pgs, 16 pools, 10241 kB data, 51 objects
       279 MB used, 18086 MB / 18365 MB avail
       6/151 objects degraded (3.974%)
       203 active+degraded
       1717 active+clean
It was fully healthy before adding the radosgw pools .. yet, I still can put objects to the cluster (without using RGW)
Best!
On Feb 15, 2015, at 12:39 AM, B L < super.iterator@xxxxxxxxx > wrote:
That’s what I usually do to check if rgw is running with no problems: sudo radosgw -c ceph.conf -d
I already bumped up the log level, but I can’t see any increase in the verbosity of the logs; I still get the same output:
2015-02-14 22:27:57.513151 7f26c79d27c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 7924
2015-02-14 22:27:57.573564 7f26c79d27c0 0 framework: fastcgi
2015-02-14 22:27:57.573569 7f26c79d27c0 0 starting handler: fastcgi
2015-02-14 22:27:57.575349 7f269affd700 0 ERROR: FCGX_Accept_r returned -9
2015-02-14 22:27:57.670610 7f269bfff700 0 ERROR: can't read user header: ret=-2
2015-02-14 22:27:57.670613 7f269bfff700 0 ERROR: sync_user() failed, user=cephtest ret=-2
2015-02-14 22:27:57.671382 7f269bfff700 0 ERROR: can't read user header: ret=-2
2015-02-14 22:27:57.671384 7f269bfff700 0 ERROR: sync_user() failed, user=cephtestss ret=-2
^C2015-02-14 22:28:30.693140 7f269b7fe700 1 handle_sigterm
2015-02-14 22:28:30.693170 7f269b7fe700 1 handle_sigterm set alarm for 120
2015-02-14 22:28:30.693179 7f26c79d27c0 -1 shutting down
2015-02-14 22:28:30.717340 7f26c79d27c0 1 final shutdown
Please let me know if I can do something more ..
Now I have 2 questions:
1- Which RADOS user are you referring to?
2- How would I know that I am using the wrong cephx keys unless I see an authentication error or a relevant warning?
Thanks! Beanos
On Feb 14, 2015, at 11:29 PM, Yehuda Sadeh-Weinraub < yehuda@xxxxxxxxxx > wrote:
From: "B L" < super.iterator@xxxxxxxxx >
To: "Yehuda Sadeh-Weinraub" < yehuda@xxxxxxxxxx >
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Saturday, February 14, 2015 11:03:42 AM
Subject: Re: Having problem to start Radosgw
Hello Yehuda,
The strace command you pointed me to shows this: https://gist.github.com/anonymous/8e9f1ced485996a263bb
Additionally, I traced this log file: /var/log/radosgw/ceph-client.radosgw.gateway
it has the following:
2015-02-12 18:23:32.247679 7fecca5257c0 -1 did not load config file, using default settings.
2015-02-12 18:23:32.247745 7fecca5257c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20477
2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider (RADOS)
2015-02-12 18:23:58.494026 7faab31377c0 -1 did not load config file, using default settings.
2015-02-12 18:23:58.494092 7faab31377c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20509
2015-02-12 18:23:58.497420 7faab31377c0 -1 Couldn't init storage provider (RADOS)
2015-02-14 17:13:03.478688 7f86f09567c0 -1 did not load config file, using default settings.
2015-02-14 17:13:03.478778 7f86f09567c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 2989
2015-02-14 17:13:03.482850 7f86f09567c0 -1 Couldn't init storage provider (RADOS)
2015-02-14 17:13:29.477530 7ff18226a7c0 -1 did not load config file, using default settings.
2015-02-14 17:13:29.477595 7ff18226a7c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3033
2015-02-14 17:13:29.481173 7ff18226a7c0 -1 Couldn't init storage provider (RADOS)
2015-02-14 17:21:00.950847 7ffee3a3b7c0 -1 did not load config file, using default settings.
2015-02-14 17:21:00.950916 7ffee3a3b7c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3086
2015-02-14 17:21:00.954085 7ffee3a3b7c0 -1 Couldn't init storage provider (RADOS)
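When a log like the one above repeats the same few entries across many start attempts, it can help to count the distinct level -1 messages. The following is a minimal Python sketch, assuming each entry has the layout "<date> <time> <thread id> <level> <message>" seen in the log above. The sample lines are copied from that log; the function name error_messages is only for illustration.

```python
import re
from collections import Counter

# Three entries copied from the log above.
SAMPLE_LOG = """\
2015-02-12 18:23:32.247679 7fecca5257c0 -1 did not load config file, using default settings.
2015-02-12 18:23:32.247745 7fecca5257c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20477
2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider (RADOS)
"""

# Assumed layout: timestamp, hex thread id, numeric level, free-form message.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) ([0-9a-f]+)\s+(-?\d+) (.*)$"
)


def error_messages(log_text):
    """Count messages logged at a negative level (e.g. -1)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and int(match.group(3)) < 0:
            counts[match.group(4)] += 1
    return counts


for message, count in error_messages(SAMPLE_LOG).items():
    print(count, message)
```

Note that this only picks up entries with a negative level; lines such as "ERROR: ..." logged at level 0 would need a separate check.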
It turns out that the last line of the log is emitted by this piece of code in rgw_main.cc:
… …
FCGX_Init();
RGWStoreManager store_manager;
if (!store_manager.init("rados", g_ceph_context)) {
  derr << "Couldn't init storage provider (RADOS)" << dendl;
  return EIO;
}
RGWProcess process(g_ceph_context, 20);
process.run();
return 0;
N.B. you can find it at http://workbench.dachary.org/ceph/ceph/raw/8d63e140777bbdd061baa6845d57e6c3cc771f76/src/rgw/rgw_main.cc , 10th line from the bottom.
Is that by any means related to the problem?
Not related. This actually means that it couldn't connect to the rados backend, so there's a different issue now. The strace log doesn't provide much with regard to the original issue, as it didn't get to that part this time. You can try bumping up the debug level (debug rgw = 20, debug ms = 1). I assume the issue you're seeing is that the wrong rados user and/or the wrong cephx keys are being used. Try running it again as you usually do, check which params are normally passed when starting radosgw, and use those when running the strace command.
Yehuda
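As an illustration of the debug settings mentioned above, here is a minimal Python sketch that renders them as a ceph.conf section. Placing them under [client.radosgw.gateway] is an assumption based on the gateway user named elsewhere in this thread; the function name debug_section is hypothetical.

```python
import configparser
import io


def debug_section(name="client.radosgw.gateway"):
    """Render the suggested debug options as an ini-style section."""
    parser = configparser.ConfigParser(interpolation=None)
    # Option names and values as given in the message above.
    parser[name] = {"debug rgw": "20", "debug ms": "1"}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


print(debug_section())
```

The printed section would then be merged by hand into the ceph.conf that radosgw actually loads, followed by a restart of the gateway.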
On Feb 14, 2015, at 7:24 PM, Yehuda Sadeh-Weinraub < yehuda@xxxxxxxxxx > wrote:
sudo strace -F -T -tt -o/tmp/strace.out radosgw -c ceph.conf -f
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com