Re: Having problem to start Radosgw

B L <super.iterator@xxxxxxxxx> · Mon, 16 Feb 2015 12:20:51 +0200

It turned out to be an authentication problem. I recreated the keyring file and re-added the RGW user to the cluster, as follows:

ceph-authtool --create-keyring ceph.client.radosgw.keyring
ceph-authtool ceph.client.radosgw.keyring -n client.radosgw.gateway --gen-key
ceph-authtool -n client.radosgw.gateway --cap osd 'allow rwx' --cap mon 'allow rwx' ceph.client.radosgw.keyring
ceph -k ceph.client.admin.keyring auth add client.radosgw.gateway -i ceph.client.radosgw.keyring

Now I can run RGW as follows (the -n option is required; without it you will see all sorts of errors):
sudo radosgw -c ceph.conf -n client.radosgw.gateway -d
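
For anyone wondering why -n matters: a minimal Python sketch of the lookup, not how Ceph itself parses the file. It assumes the gateway settings live in a [client.radosgw.gateway] section of ceph.conf, that a client started without -n is treated as client.admin, and that options are resolved from the entity section, then the type section, then [global]. The sample file contents and the helper name effective_option are made up for illustration.

```python
import configparser

# Hypothetical ceph.conf contents; the real file from this thread is not shown here.
SAMPLE_CONF = """
[global]
auth client required = cephx

[client.radosgw.gateway]
keyring = /etc/ceph/ceph.client.radosgw.keyring
debug rgw = 20
"""


def effective_option(conf_text, entity, option):
    """Return the value `entity` would see for `option`, or None if unset.

    Simplified lookup order (an assumption): [entity], then [type], then [global].
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    entity_type = entity.split(".", 1)[0]  # "client" for "client.radosgw.gateway"
    for section in (entity, entity_type, "global"):
        # has_option() returns False for a missing section, so no exception here.
        if parser.has_option(section, option):
            return parser.get(section, option)
    return None


if __name__ == "__main__":
    for name in ("client.radosgw.gateway", "client.admin"):
        print(name, "keyring =", effective_option(SAMPLE_CONF, name, "keyring"))
```

Under these assumptions, the keyring and debug settings are only visible when the daemon runs as client.radosgw.gateway, which would also explain a raised debug level having no visible effect without -n.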

Since we usually want to start RGW with the init.d script, the configuration file has to be in the default expected location, “/etc/ceph”, if it was not put there from the very beginning. In my case it was not, so I had to suffer a little, since this was my first time installing RGW and adding it to the cluster.

Now we can start it with either of these:
sudo service radosgw start
sudo /etc/init.d/radosgw start

And everything should work.

Thanks, Yehuda, for your support.

Beanos!

On Feb 15, 2015, at 9:37 AM, B L <super.iterator@xxxxxxxxx> wrote:

Hello Yehuda,

This is the resulting output after adding “-n client.radosgw.gateway”: https://gist.github.com/anonymous/f16701d6cacc8911620f

I can see only one problem in the above output: -1 Couldn't init storage provider (RADOS). Please check the output; you may find something useful.

On Feb 15, 2015, at 1:28 AM, Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> wrote:

Add the '-n client.radosgw.gateway' param when you're running the gateway; all your settings are under that user.

Yehuda

----- Original Message -----
From: "B L" <super.iterator@xxxxxxxxx>
To: "Yehuda Sadeh-Weinraub" <yehuda@xxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Saturday, February 14, 2015 2:56:54 PM
Subject: Re:  Having problem to start Radosgw

Yehuda,

In case you need to know more about my system:

Here is my full cluster configuration:
https://gist.github.com/anonymous/fb4c314320d7df75569a

And this is my Ceph cluster status:

$ ceph -s

    cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
     health HEALTH_WARN 203 pgs degraded; 203 pgs stuck unclean; recovery 6/151 objects degraded (3.974%)
     monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, quorum 0 ceph-node1
     osdmap e93: 6 osds: 6 up, 6 in
      pgmap v3676: 1920 pgs, 16 pools, 10241 kB data, 51 objects
            279 MB used, 18086 MB / 18365 MB avail
            6/151 objects degraded (3.974%)
                 203 active+degraded
                1717 active+clean

It was fully healthy before I added the radosgw pools. Still, I can put
objects into the cluster (without using RGW).

Best!

On Feb 15, 2015, at 12:39 AM, B L < super.iterator@xxxxxxxxx > wrote:

This is what I usually run to check whether RGW starts without problems:
sudo radosgw -c ceph.conf -d

I already bumped up the log level, but I can’t see any change or increase in
the verbosity of the logs. I still get the same output:

2015-02-14 22:27:57.513151 7f26c79d27c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 7924
2015-02-14 22:27:57.573564 7f26c79d27c0 0 framework: fastcgi
2015-02-14 22:27:57.573569 7f26c79d27c0 0 starting handler: fastcgi
2015-02-14 22:27:57.575349 7f269affd700 0 ERROR: FCGX_Accept_r returned -9
2015-02-14 22:27:57.670610 7f269bfff700 0 ERROR: can't read user header: ret=-2
2015-02-14 22:27:57.670613 7f269bfff700 0 ERROR: sync_user() failed, user=cephtest ret=-2
2015-02-14 22:27:57.671382 7f269bfff700 0 ERROR: can't read user header: ret=-2
2015-02-14 22:27:57.671384 7f269bfff700 0 ERROR: sync_user() failed, user=cephtestss ret=-2
^C2015-02-14 22:28:30.693140 7f269b7fe700 1 handle_sigterm
2015-02-14 22:28:30.693170 7f269b7fe700 1 handle_sigterm set alarm for 120
2015-02-14 22:28:30.693179 7f26c79d27c0 -1 shutting down
2015-02-14 22:28:30.717340 7f26c79d27c0 1 final shutdown
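
A minimal Python sketch for pulling the interesting lines out of output like the above. It assumes each unwrapped line has the layout "date time thread-id level message", that a negative level or a message starting with ERROR marks a problem, and that values such as ret=-2 are negated errno numbers (so -2 would be ENOENT, "No such file or directory"). Those readings are assumptions drawn from the lines shown here, not from the radosgw source, and the helper names are made up.

```python
import errno
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumed layout of one unwrapped log line: date, time, thread id, level, message.
LOG_LINE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+)\s+(?P<message>.*)$"
)
# Matches "ret=-2" as well as "returned -9".
RET_CODE = re.compile(r"(?:ret=|returned )(-\d+)")


class Entry(NamedTuple):
    stamp: str
    thread: str
    level: int
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Split one log line into fields; return None if it does not fit the layout."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return Entry(match["stamp"], match["thread"], int(match["level"]), match["message"])


def problems(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield entries with a negative level or a message that starts with ERROR."""
    for line in lines:
        entry = parse_line(line)
        if entry and (entry.level < 0 or entry.message.startswith("ERROR")):
            yield entry


def errno_name(message: str) -> Optional[str]:
    """Map 'ret=-N' / 'returned -N' to an errno name, assuming -errno semantics."""
    match = RET_CODE.search(message)
    if match is None:
        return None
    return errno.errorcode.get(-int(match.group(1)))


SAMPLE = [
    "2015-02-14 22:27:57.573564 7f26c79d27c0 0 framework: fastcgi",
    "2015-02-14 22:27:57.670613 7f269bfff700 0 ERROR: sync_user() failed, user=cephtest ret=-2",
    "2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider (RADOS)",
]

if __name__ == "__main__":
    for entry in problems(SAMPLE):
        print(entry.stamp, entry.level, errno_name(entry.message), entry.message)
```

Lines wrapped by a mail client have to be joined first, and a line with a stray prefix (such as the one starting with ^C) is skipped rather than guessed at.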

Please let me know if I can do something more.

Now I have 2 questions:
1- Which RADOS user are you referring to?
2- How would I know that I am using the wrong cephx keys unless I see an
authentication error or a relevant warning?

Thanks!
Beanos

On Feb 14, 2015, at 11:29 PM, Yehuda Sadeh-Weinraub < yehuda@xxxxxxxxxx >
wrote:

From: "B L" < super.iterator@xxxxxxxxx >
To: "Yehuda Sadeh-Weinraub" < yehuda@xxxxxxxxxx >
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Saturday, February 14, 2015 11:03:42 AM
Subject: Re:  Having problem to start Radosgw

Hello Yehuda,

The strace command you pointed me to shows this:
https://gist.github.com/anonymous/8e9f1ced485996a263bb

Additionally, I checked this log file:
/var/log/radosgw/ceph-client.radosgw.gateway

It has the following:

2015-02-12 18:23:32.247679 7fecca5257c0 -1 did not load config file, using default settings.
2015-02-12 18:23:32.247745 7fecca5257c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20477
2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider (RADOS)
2015-02-12 18:23:58.494026 7faab31377c0 -1 did not load config file, using default settings.
2015-02-12 18:23:58.494092 7faab31377c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20509
2015-02-12 18:23:58.497420 7faab31377c0 -1 Couldn't init storage provider (RADOS)
2015-02-14 17:13:03.478688 7f86f09567c0 -1 did not load config file, using default settings.
2015-02-14 17:13:03.478778 7f86f09567c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 2989
2015-02-14 17:13:03.482850 7f86f09567c0 -1 Couldn't init storage provider (RADOS)
2015-02-14 17:13:29.477530 7ff18226a7c0 -1 did not load config file, using default settings.
2015-02-14 17:13:29.477595 7ff18226a7c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3033
2015-02-14 17:13:29.481173 7ff18226a7c0 -1 Couldn't init storage provider (RADOS)
2015-02-14 17:21:00.950847 7ffee3a3b7c0 -1 did not load config file, using default settings.
2015-02-14 17:21:00.950916 7ffee3a3b7c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3086
2015-02-14 17:21:00.954085 7ffee3a3b7c0 -1 Couldn't init storage provider (RADOS)

It turns out that the last line of the logs is emitted by this piece of
code in rgw_main.cc:

…
…

  FCGX_Init();

  RGWStoreManager store_manager;

  if (!store_manager.init("rados", g_ceph_context)) {
    derr << "Couldn't init storage provider (RADOS)" << dendl;
    return EIO;
  }

  RGWProcess process(g_ceph_context, 20);

  process.run();

  return 0;

N.B. You can find it here, 10th line from the bottom:
http://workbench.dachary.org/ceph/ceph/raw/8d63e140777bbdd061baa6845d57e6c3cc771f76/src/rgw/rgw_main.cc

Is that in any way related to the problem?

Not related. This actually means that it couldn't connect to the rados
backend, so there's a different issue now. The strace log doesn't tell us
much about the original issue, as the run didn't get to that part this time.
You can try bumping up the debug level (debug rgw = 20, debug ms = 1). I
assume that the issue you're seeing is that the wrong rados user and/or
wrong cephx keys are being used. Try to run it again as you usually do, see
which regular params are being passed when starting radosgw, and use these
when running the strace command.

Yehuda

On Feb 14, 2015, at 7:24 PM, Yehuda Sadeh-Weinraub < yehuda@xxxxxxxxxx >
wrote:

sudo strace -F -T -tt -o/tmp/strace.out radosgw -c ceph.conf -f

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com