On our Ubuntu 14.04 / Firefly 0.80.8 cluster we are seeing a
problem with log file rotation for the rados gateway.
The /etc/logrotate.d/radosgw script gets called, but it does not
work correctly. It emits this message, coming from the postrotate
portion:
/etc/cron.daily/logrotate:
reload: Unknown parameter: id
invoke-rc.d: initscript radosgw, action "reload" failed.
A new log file does get created, but because the postrotate
script fails, the daemon keeps writing to the now-deleted
previous file:
[B|root@node01] /etc/init ➜ ps aux | grep radosgw
root 13077 0.9 0.1 13710396 203256 ? Ssl Feb14 212:27
/usr/bin/radosgw -n client.radosgw.node01
[B|root@node01] /etc/init ➜ ls -l /proc/13077/fd/
total 0
lr-x------ 1 root root 64 Mar 2 15:53 0 -> /dev/null
lr-x------ 1 root root 64 Mar 2 15:53 1 -> /dev/null
lr-x------ 1 root root 64 Mar 2 15:53 2 -> /dev/null
l-wx------ 1 root root 64 Mar 2 15:53 3 ->
/var/log/radosgw/radosgw.log.1 (deleted)
...
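The situation above can be spotted programmatically. This is a minimal sketch (assuming a Linux /proc filesystem; "radosgw" is just the process name from the output above) that flags any radosgw process still holding a deleted file open:

```shell
#!/bin/sh
# Check whether a process still holds a deleted file open.
has_deleted_fd() {
    # ls -l on /proc/<pid>/fd appends "(deleted)" to the target path
    # when the file has been unlinked but the descriptor is still open.
    ls -l "/proc/$1/fd" 2>/dev/null | grep -q '(deleted)'
}

for pid in $(pgrep radosgw); do
    has_deleted_fd "$pid" && echo "radosgw PID $pid is writing to a deleted file"
done
```

A cron job built on this could serve as a safety net until the reload problem is fixed.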
Trying "service radosgw reload" manually fails with the same
message. Running the non-upstart /etc/init.d/radosgw reload
works; it rather crudely just sends a SIGHUP to every running
radosgw process.
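The essence of that init.d reload path can be sketched in a couple of lines (a simplification of what the script does; the function name is my own, and the daemon name is passed in as a parameter):

```shell
#!/bin/sh
# Minimal sketch of the non-upstart reload path: send SIGHUP to every
# process with the given name so it reopens its log file after rotation.
reload_daemon() {
    # -x matches the exact process name, -HUP selects the signal
    pkill -HUP -x "$1" 2>/dev/null
}

# e.g.: reload_daemon radosgw
```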
To figure out the cause I compared the OSDs and the RadosGW with
respect to upstart and got this:
[B|root@node01] /etc/init ➜ initctl list | grep osd
ceph-osd-all start/running
ceph-osd-all-starter stop/waiting
ceph-osd (ceph/8) start/running, process 12473
ceph-osd (ceph/9) start/running, process 12503
...
[B|root@node01] /etc/init ➜ initctl reload radosgw cluster="ceph"
id="radosgw.node01"
initctl: Unknown instance: ceph/radosgw.node01
[B|root@node01] /etc/init ➜ initctl list | grep rados
radosgw-instance stop/waiting
radosgw stop/waiting
radosgw-all-starter stop/waiting
radosgw-all start/running
Apart from my not being entirely clear on the difference between
radosgw-instance and radosgw, Upstart obviously has no idea
which PID to send the SIGHUP to when I ask it to reload.
I can, of course, replace the logrotate config and use the
/etc/init.d/radosgw reload approach instead, but I would like to
understand whether this is something unique to our system or a
bug in the scripts.
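For reference, the replacement logrotate config I have in mind would look roughly like this (the rotation frequency and count are illustrative, not taken from the stock script; only the postrotate line differs from the broken upstart-based one):

```
/var/log/radosgw/*.log {
    daily
    rotate 7
    compress
    sharedscripts
    missingok
    notifempty
    postrotate
        # bypass the broken upstart reload and SIGHUP the daemon directly
        /etc/init.d/radosgw reload >/dev/null 2>&1 || true
    endscript
}
```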
FWIW here's an excerpt from /etc/ceph.conf:
[client.radosgw.node01]
host = node01
rgw print continue = false
keyring = /etc/ceph/keyring.radosgw.gateway
rgw socket path = /tmp/radosgw.sock
log file = /var/log/radosgw/radosgw.log
rgw enable ops log = false
rgw gc max objs = 31
Thanks!
Daniel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com