Re: Scrubbing a lot

German Anders <ganders@xxxxxxxxxxxx> · Tue, 29 Mar 2016 17:19:03 -0300

I've just upgraded to Jewel, and the scrubbing seems to have been corrected... but now I'm not able to map an rbd on a host (before, I was able to). Basically I'm getting this error msg:

rbd: sysfs write failed
rbd: map failed: (5) Input/output error

# rbd --cluster cephIB create host01 --size 102400 --pool cinder-volumes -k /etc/ceph/cephIB.client.cinder.keyring
# rbd --cluster cephIB map host01 --pool cinder-volumes -k /etc/ceph/cephIB.client.cinder.keyring
rbd: sysfs write failed
rbd: map failed: (5) Input/output error

Any ideas? In the /etc/ceph directory on the host I have:

-rw-r--r-- 1 ceph ceph  92 Nov 17 15:45 rbdmap
-rw-r--r-- 1 ceph ceph 170 Dec 15 14:47 secret.xml
-rw-r--r-- 1 ceph ceph  37 Dec 15 15:12 virsh-secret
-rw-r--r-- 1 ceph ceph   0 Dec 15 15:12 virsh-secret-set
-rw-r--r-- 1 ceph ceph  37 Dec 21 14:53 virsh-secretIB
-rw-r--r-- 1 ceph ceph   0 Dec 21 14:53 virsh-secret-setIB
-rw-r--r-- 1 ceph ceph 173 Dec 22 13:34 secretIB.xml
-rw-r--r-- 1 ceph ceph 619 Dec 22 13:38 ceph.conf
-rw-r--r-- 1 ceph ceph  72 Dec 23 09:51 ceph.client.cinder.keyring
-rw-r--r-- 1 ceph ceph  63 Mar 28 09:03 cephIB.client.cinder.keyring
-rw-r--r-- 1 ceph ceph 526 Mar 28 12:06 cephIB.conf
-rw------- 1 ceph ceph  63 Mar 29 16:11 cephIB.client.admin.keyring

Thanks in advance,

Best,

German

2016-03-29 14:45 GMT-03:00 German Anders <ganders@xxxxxxxxxxxx>:
Sure. Also, the scrubbing is happening on all the OSDs :S

# ceph --cluster cephIB daemon osd.4 config diff
{
    "diff": {
        "current": {
            "admin_socket": "\/var\/run\/ceph\/cephIB-osd.4.asok",
            "auth_client_required": "cephx",
            "filestore_fd_cache_size": "10240",
            "filestore_journal_writeahead": "true",
            "filestore_max_sync_interval": "10",
            "filestore_merge_threshold": "40",
            "filestore_op_threads": "20",
            "filestore_queue_max_ops": "100000",
            "filestore_split_multiple": "8",
            "fsid": "a4bce51b-4d6b-4394-9737-3e4d9f5efed2",
            "internal_safe_to_start_threads": "true",
            "keyring": "\/var\/lib\/ceph\/osd\/cephIB-4\/keyring",
            "leveldb_log": "",
            "log_file": "\/var\/log\/ceph\/cephIB-osd.4.log",
            "log_to_stderr": "false",
            "mds_data": "\/var\/lib\/ceph\/mds\/cephIB-4",
            "mon_cluster_log_file": "default=\/var\/log\/ceph\/cephIB.$channel.log cluster=\/var\/log\/ceph\/cephIB.log",
            "mon_data": "\/var\/lib\/ceph\/mon\/cephIB-4",
            "mon_debug_dump_location": "\/var\/log\/ceph\/cephIB-osd.4.tdump",
            "mon_host": "172.23.16.1,172.23.16.2,172.23.16.3",
            "mon_initial_members": "cibm01, cibm02, cibm03",
            "osd_data": "\/var\/lib\/ceph\/osd\/cephIB-4",
            "osd_journal": "\/var\/lib\/ceph\/osd\/cephIB-4\/journal",
            "osd_op_threads": "8",
            "rgw_data": "\/var\/lib\/ceph\/radosgw\/cephIB-4",
            "setgroup": "ceph",
            "setuser": "ceph"
        },
        "defaults": {
            "admin_socket": "\/var\/run\/ceph\/ceph-osd.4.asok",
            "auth_client_required": "cephx, none",
            "filestore_fd_cache_size": "128",
            "filestore_journal_writeahead": "false",
            "filestore_max_sync_interval": "5",
            "filestore_merge_threshold": "10",
            "filestore_op_threads": "2",
            "filestore_queue_max_ops": "50",
            "filestore_split_multiple": "2",
            "fsid": "00000000-0000-0000-0000-000000000000",
            "internal_safe_to_start_threads": "false",
            "keyring": "\/etc\/ceph\/ceph.osd.4.keyring,\/etc\/ceph\/ceph.keyring,\/etc\/ceph\/keyring,\/etc\/ceph\/keyring.bin",
            "leveldb_log": "\/dev\/null",
            "log_file": "\/var\/log\/ceph\/ceph-osd.4.log",
            "log_to_stderr": "true",
            "mds_data": "\/var\/lib\/ceph\/mds\/ceph-4",
            "mon_cluster_log_file": "default=\/var\/log\/ceph\/ceph.$channel.log cluster=\/var\/log\/ceph\/ceph.log",
            "mon_data": "\/var\/lib\/ceph\/mon\/ceph-4",
            "mon_debug_dump_location": "\/var\/log\/ceph\/ceph-osd.4.tdump",
            "mon_host": "",
            "mon_initial_members": "",
            "osd_data": "\/var\/lib\/ceph\/osd\/ceph-4",
            "osd_journal": "\/var\/lib\/ceph\/osd\/ceph-4\/journal",
            "osd_op_threads": "2",
            "rgw_data": "\/var\/lib\/ceph\/radosgw\/ceph-4",
            "setgroup": "",
            "setuser": ""
        }
    },
    "unknown": []
}
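
The diff above is easier to scan as one "default -> current" line per option. The following is only a minimal sketch, assuming the JSON printed by `ceph daemon osd.N config diff` has been captured as a string; `overridden_settings` is a hypothetical helper name and not part of Ceph, and the sample is a trimmed copy of the osd.4 output.

```python
import json
from typing import List, Tuple


def overridden_settings(diff_json: str) -> List[Tuple[str, str, str]]:
    """Return (option, default, current) for each option in a config diff dump."""
    diff = json.loads(diff_json)["diff"]
    current, defaults = diff["current"], diff["defaults"]
    # Sort by option name so related options (filestore_*, osd_*) group together.
    return [
        (name, defaults.get(name, "<missing>"), current[name])
        for name in sorted(current)
    ]


# Trimmed sample copied from the osd.4 output above (assumption: same layout
# as the full dump, just fewer keys).
sample = """
{
    "diff": {
        "current": {
            "filestore_op_threads": "20",
            "filestore_queue_max_ops": "100000",
            "osd_op_threads": "8"
        },
        "defaults": {
            "filestore_op_threads": "2",
            "filestore_queue_max_ops": "50",
            "osd_op_threads": "2"
        }
    },
    "unknown": []
}
"""

rows = overridden_settings(sample)
for name, default, current in rows:
    print(f"{name}: {default} -> {current}")

# Options whose name mentions scrubbing; empty for this sample.
scrub_rows = [row for row in rows if "scrub" in row[0]]
print("scrub-related overrides:", scrub_rows)
```

Run against the full osd.4 dump, the scrub filter also comes back empty: none of the listed overrides is an `osd_scrub_*` option.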

Thanks a lot!

Best,

German

2016-03-29 14:10 GMT-03:00 Samuel Just <sjust@xxxxxxxxxx>:
That seems to be scrubbing pretty often.  Can you attach a config diff
from osd.4 (ceph daemon osd.4 config diff)?

-Sam

On Tue, Mar 29, 2016 at 9:30 AM, German Anders <ganders@xxxxxxxxxxxx> wrote:
> Hi All,
>
> Maybe a simple question: I've set up a new cluster with the Infernalis
> release, there's no IO going on at the cluster level, and I'm receiving a lot
> of these messages:
>
> 2016-03-29 12:22:07.462818 mon.0 [INF] pgmap v158062: 8192 pgs: 8192 active+clean; 20617 MB data, 46164 MB used, 52484 GB / 52529 GB avail
> 2016-03-29 12:22:08.176684 osd.13 [INF] 0.d38 scrub starts
> 2016-03-29 12:22:08.179841 osd.13 [INF] 0.d38 scrub ok
> 2016-03-29 12:21:59.526355 osd.9 [INF] 0.8a6 scrub starts
> 2016-03-29 12:21:59.529582 osd.9 [INF] 0.8a6 scrub ok
> 2016-03-29 12:22:03.004107 osd.4 [INF] 0.38b scrub starts
> 2016-03-29 12:22:03.007220 osd.4 [INF] 0.38b scrub ok
> 2016-03-29 12:22:03.617706 osd.21 [INF] 0.525 scrub starts
> 2016-03-29 12:22:03.621073 osd.21 [INF] 0.525 scrub ok
> 2016-03-29 12:22:06.527264 osd.9 [INF] 0.8a6 scrub starts
> 2016-03-29 12:22:06.529150 osd.9 [INF] 0.8a6 scrub ok
> 2016-03-29 12:22:07.005628 osd.4 [INF] 0.38b scrub starts
> 2016-03-29 12:22:07.009776 osd.4 [INF] 0.38b scrub ok
> 2016-03-29 12:22:07.618191 osd.21 [INF] 0.525 scrub starts
> 2016-03-29 12:22:07.621363 osd.21 [INF] 0.525 scrub ok
>
>
> I mean, all the time, and AFAIK the scrub operation is like an fsck at the
> object level, so this makes me think that it's not a normal situation.
> Is there any command that I can run in order to check this?
>
> # ceph --cluster cephIB health detail
> HEALTH_OK
>
>
> Thanks in advance,
>
> Best,
>
> German
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
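
To put a number on how often the same placement group is being scrubbed, the "scrub starts" lines from the quoted cluster log can be grouped by PG and the gaps between consecutive starts measured. This is only a rough sketch: `scrub_intervals` is a hypothetical helper, the regular expression assumes the exact line layout shown in the quoted message (with the `> ` quote markers already stripped), and the sample holds just the "scrub starts" lines from that excerpt.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches cluster-log lines such as:
#   2016-03-29 12:22:03.004107 osd.4 [INF] 0.38b scrub starts
SCRUB_START = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<osd>osd\.\d+) \[INF\] (?P<pg>\S+) scrub starts$"
)


def scrub_intervals(lines):
    """Map each PG id to the seconds between its consecutive scrub starts."""
    starts = defaultdict(list)
    for line in lines:
        match = SCRUB_START.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            starts[match.group("pg")].append(stamp)
    intervals = {}
    for pg, stamps in starts.items():
        stamps.sort()  # the log excerpt is not strictly in time order
        intervals[pg] = [
            (later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return intervals


# "scrub starts" lines copied from the quoted message.
log = """\
2016-03-29 12:22:08.176684 osd.13 [INF] 0.d38 scrub starts
2016-03-29 12:21:59.526355 osd.9 [INF] 0.8a6 scrub starts
2016-03-29 12:22:03.004107 osd.4 [INF] 0.38b scrub starts
2016-03-29 12:22:03.617706 osd.21 [INF] 0.525 scrub starts
2016-03-29 12:22:06.527264 osd.9 [INF] 0.8a6 scrub starts
2016-03-29 12:22:07.005628 osd.4 [INF] 0.38b scrub starts
2016-03-29 12:22:07.618191 osd.21 [INF] 0.525 scrub starts
""".splitlines()

for pg, gaps in sorted(scrub_intervals(log).items()):
    print(pg, [round(gap, 3) for gap in gaps])
```

On these lines, PGs 0.38b and 0.525 are each scrubbed again after about 4 seconds and 0.8a6 after about 7 seconds, while 0.d38 appears only once in the excerpt.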
