Hi Experts,

After initially deploying Ceph with 3 OSDs, I am now facing an issue: the cluster reports healthy, but access to the pools sometimes (or often) fails, and then sometimes comes back to normal on its own. For example:

[ceph@gcloudcon ceph-cluster]$ rados -p volumes ls
2015-03-24 11:44:17.262941 7f3d6bfff700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.262951 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55582 s=1 pgs=0 cs=0 l=1 c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:17.262999 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55582 s=1 pgs=0 cs=0 l=1 c=0x26d8270).fault
2015-03-24 11:44:17.263637 7f3d6bfff700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.263645 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55583 s=1 pgs=0 cs=0 l=1 c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:17.464379 7f3d6bfff700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.464388 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55584 s=1 pgs=0 cs=0 l=1 c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:17.865222 7f3d6bfff700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.865245 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55585 s=1 pgs=0 cs=0 l=1 c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:18.666056 7f3d6bfff700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2015-03-24 11:44:18.666077 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55587 s=1 pgs=0 cs=0 l=1 c=0x26d8270).failed verifying authorize reply
[ceph@gcloudcon ceph-cluster]$ ceph auth list
installed auth entries:

mds.gcloudnet
	key: xxxxxxx
	caps: [mds] allow
	caps: [mon] allow profile mds
	caps: [osd] allow rwx
osd.0
	key: xxxxxxx
	caps: [mon] allow profile osd
	caps: [osd] allow *
osd.1
	key: xxxxxxx
	caps: [mon] allow profile osd
	caps: [osd] allow *
osd.2
	key: xxxxxxx
	caps: [mon] allow profile osd
	caps: [osd] allow *
client.admin
	key: xxxxxxx
	caps: [mds] allow
	caps: [mon] allow *
	caps: [osd] allow *
client.backups
	key: xxxxxxx
	caps: [mon] allow r
	caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=backups
client.bootstrap-mds
	key: xxxxxxx
	caps: [mon] allow profile bootstrap-mds
client.bootstrap-osd
	key: xxxxxxx
	caps: [mon] allow profile bootstrap-osd
client.images
	key: xxxxxxx
	caps: [mon] allow r
	caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=images
client.libvirt
	key: xxxxxxx
	caps: [mon] allow r
	caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=libvirt-pool
client.volumes
	key: xxxxxxx
	caps: [mon] allow r
	caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rx pool=images

[root@gcloudcon ~]# more /etc/ceph/ceph.conf
[global]
auth_service_required = cephx
osd_pool_default_size = 2
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 206.12.25.26
public_network = 206.12.25.0/16
mon_initial_members = gcloudnet
cluster_network = 192.168.10.0/16
fsid = xxxxxx

[client.images]
keyring = /etc/ceph/ceph.client.images.keyring

[client.volumes]
keyring = /etc/ceph/ceph.client.volumes.keyring

[client.backups]
keyring = /etc/ceph/ceph.client.backups.keyring

[ceph@gcloudcon ceph-cluster]$ ceph -w
    cluster a4d0879f-abdc-4f9d-8a4b-53ce57d822f1
     health HEALTH_OK
     monmap e1: 1 mons at {gcloudnet=206.12.25.26:6789/0}, election epoch 1, quorum 0 gcloudnet
     osdmap e27: 3 osds: 3 up, 3 in
      pgmap v1894: 704 pgs, 6 pools, 1640 MB data, 231 objects
            18757 MB used, 22331 GB / 22350 GB avail
                 704
active+clean

2015-03-24 17:56:20.884293 mon.0 [INF] from='client.? 206.12.25.25:0/1006501' entity='client.admin' cmd=[{"prefix": "auth list"}]: dispatch

Can anybody give me a hint about what I should check?

Thanks,

--
------------------------------------
 Erming Pei, Senior System Analyst
 Information Services & Technology
 University of Alberta, Canada

 Tel: 7804929914    Fax: 7804921729
------------------------------------
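For reference, a minimal sketch of one check that may apply here: "verify_reply couldn't decrypt" typically points at either clock skew between nodes or a keyring on one host that no longer matches what the monitor holds, so comparing the key a client/daemon uses locally against the one reported by `ceph auth get` is a reasonable first step. The sketch below uses stand-in files and a made-up key, since the real values are redacted above and no live cluster is assumed.

```shell
# Extract the secret from the first "key = ..." line in a keyring file.
extract_key() {
    sed -n 's/^[[:space:]]*key[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Stand-in keyrings. On a real cluster these would be, e.g.,
# /etc/ceph/ceph.client.volumes.keyring on the client host and the
# output of `ceph auth get client.volumes` saved to a file.
cat > /tmp/local.keyring <<'EOF'
[client.volumes]
	key = AQBexampleStandInKey111==
EOF
cat > /tmp/cluster.keyring <<'EOF'
[client.volumes]
	key = AQBexampleStandInKey111==
EOF

if [ "$(extract_key /tmp/local.keyring)" = "$(extract_key /tmp/cluster.keyring)" ]; then
    echo "keys match"
else
    echo "keys MISMATCH - redeploy the keyring from the monitor"
fi

# Also worth checking on each node: clock synchronization (compare `date`
# output / NTP status across hosts), since cephx tickets are time-sensitive.
```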
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com