Thanks for responding, and I apologize for taking so long to get back to
you. To answer your question, all of the OSDs are running
ceph version 10.2.2
The OSD with the bad disk is down and out of the cluster. The disk may
get replaced today. I'm still getting blocked request messages, but at
a significantly lower rate. Most of the current messages are associated
with two hosts.
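If it helps with narrowing down those two hosts, something like the following
should show which OSDs the blocked requests are being attributed to (Jewel-era
commands from memory, so please double-check the syntax):

   # show current health warnings, including which OSDs have blocked requests
   ceph health detail

   # on the host of a suspect OSD, dump the requests it currently has in flight
   ceph daemon osd.<id> dump_ops_in_flight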
Jon
On 9/19/2016 7:45 PM, Will.Boege wrote:
Sorry make that 'ceph tell osd.* version'
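For reference, roughly this should print the running version of every OSD, or
of a single daemon via its local admin socket (a sketch only, verify on your
cluster):

   # every running OSD reports its version over the wire
   ceph tell osd.* version

   # or per daemon, on the OSD host itself
   ceph daemon osd.<id> version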
On Sep 19, 2016, at 2:55 PM, WRIGHT, JON R (JON R) <jonrodwright@xxxxxxxxx> wrote:
When you say client, we're actually doing everything through OpenStack VMs and Cinder block devices.
librbd and librados are:
/usr/lib/librbd.so.1.0.0
/usr/lib/librados.so.2
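Those are just the library file paths; the installed package versions should
show up with dpkg on Ubuntu, assuming the stock librbd1/librados2 packages:

   # package versions behind librbd.so / librados.so on an Ubuntu client host
   dpkg -s librbd1 | grep Version
   dpkg -s librados2 | grep Version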
But I think this problem may have been related to a disk going bad. We got disk I/O errors over the weekend and are replacing the disk, and I think the blocked requests may have all been associated with PGs that included the bad OSD/disk.
Would this make sense?
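One rough way to check that theory, assuming the failed OSD is osd.<N> (a
placeholder), is to compare the PGs mapped to it against the PGs named in the
blocked-request warnings:

   # list the placement groups that map to the failed OSD
   ceph pg ls-by-osd <N>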
Jon
On 9/15/2016 3:49 AM, Wido den Hollander wrote:
On 13 September 2016 at 18:54, "WRIGHT, JON R (JON R)" <jonrodwright@xxxxxxxxx> wrote:
VM Client OS: ubuntu 14.04
Openstack: kilo
libvirt: 1.2.12
nova-compute-kvm: 1:2015.1.4-0ubuntu2
What librados/librbd version are you running on the client?
Wido
Jon
On 9/13/2016 11:17 AM, Wido den Hollander wrote:
On 13 September 2016 at 15:58, "WRIGHT, JON R (JON R)" <jonrodwright@xxxxxxxxx> wrote:
Yes, I do have old clients running. The clients are all VMs. Is it
typical that VM clients have to be rebuilt after a Ceph upgrade?
No, not always, but I did see this happen recently after a Jewel upgrade.
What version are the client(s) still running?
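If the compute nodes were upgraded but the VMs never restarted, the qemu
processes may still have the old librbd mapped in memory. A quick, if crude,
check on a compute node (with <qemu-pid> as a placeholder for one of the qemu
processes):

   # which librbd a running qemu process actually has loaded
   lsof -p <qemu-pid> | grep librbd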
Wido
Thanks,
Jon
On 9/12/2016 4:05 PM, Wido den Hollander wrote:
On 12 September 2016 at 18:47, "WRIGHT, JON R (JON R)" <jonrodwright@xxxxxxxxx> wrote:
Since upgrading from Hammer to Jewel, we've started to see HEALTH_WARN
because of 'blocked requests > 32 sec'. It seems to be related to writes.
Has anyone else seen this? Or can anyone suggest what the problem might be?
Do you by any chance have old clients connecting? I saw this after a Jewel upgrade as well and it was because of very old clients still connecting to the cluster.
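One way to spot old clients, assuming the monitor admin socket is in its
default location, is to list the sessions a monitor currently holds and then
check the library versions on the client hosts that show up:

   # on a monitor host: list currently connected sessions (shows client addresses)
   ceph daemon mon.<id> sessions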
Wido
Thanks!
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com