Re: Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

Steven Vacaroaia <stef97@xxxxxxxxx> · Wed, 11 Apr 2018 20:28:44 +0000

Hello again,
I have reinstalled the cluster and noticed that with 2 servers it works as expected; adding the 3rd one tanks performance, IRRESPECTIVE of which server is the 3rd one.
I have tested it with only 1 OSD per server in order to eliminate any balancing issues.

This seems to indicate an issue with the Ceph config, but the config is quite straightforward.

Any help will be appreciated.

mon_initial_members = mon01
mon_host = 10.10.30.191
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

public_network = 10.10.30.0/24
cluster_network = 192.168.0.0/24

osd_pool_default_size = 2
osd_pool_default_min_size = 1 # Allow writing 1 copy in a degraded state
osd_crush_chooseleaf_type = 1

debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcatcher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0

[mon]
mon_allow_pool_delete = true
mon_osd_min_down_reporters = 1

[osd]
osd_mkfs_type = xfs
osd_mount_options_xfs = "rw,noatime,nodiratime,attr2,logbufs=8,logbsize=256k,largeio,inode64,swalloc,allocsize=4M"
osd_mkfs_options_xfs = "-f -i size=2048"
bluestore_block_db_size = 32212254720
bluestore_block_wal_size = 1073741824
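
The two BlueStore sizes in the [osd] section are given in bytes, which makes them hard to read at a glance. As a quick sanity check, here is a minimal Python sketch that converts them to GiB. The helper name to_gib is made up for illustration and is not part of Ceph.

```python
# Hypothetical helper (not part of Ceph): convert a byte count to GiB.
GIB = 1024 ** 3


def to_gib(num_bytes: int) -> float:
    return num_bytes / GIB


# Values copied from the [osd] section above.
sizes = {
    "bluestore_block_db_size": 32212254720,
    "bluestore_block_wal_size": 1073741824,
}

for name, value in sizes.items():
    print(f"{name} = {to_gib(value):g} GiB")
```

That works out to a 30 GiB DB and a 1 GiB WAL.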

On Wed, 11 Apr 2018 at 08:57, Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
[root@osd01 ~]# ceph osd pool ls detail -f json-pretty

[
    {
        "pool_name": "rbd",
        "flags": 1,
        "flags_names": "hashpspool",
        "type": 1,
        "size": 2,
        "min_size": 1,
        "crush_rule": 0,
        "object_hash": 2,
        "pg_num": 128,
        "pg_placement_num": 128,
        "crash_replay_interval": 0,
        "last_change": "300",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "auid": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "snap_epoch": 0,
        "pool_snaps": [],
        "removed_snaps": "[]",
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "tiers": [],
        "tier_of": -1,
        "read_tier": -1,
        "write_tier": -1,
        "cache_mode": "none",
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_full_ratio_micro": 800000,
        "cache_min_flush_age": 0,
        "cache_min_evict_age": 0,
        "erasure_code_profile": "",
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_count": 0,
        "use_gmt_hitset": true,
        "min_read_recency_for_promote": 0,
        "min_write_recency_for_promote": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_search_last_n": 0,
        "grade_table": [],
        "stripe_width": 0,
        "expected_num_objects": 0,
        "fast_read": false,
        "options": {},
        "application_metadata": {
            "rbd": {}
        }
    }
]
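
For reference, the pool values above (pg_num 128, size 2) give a rough idea of how many PG copies each OSD carries. The following is a back-of-the-envelope sketch in Python, assuming PGs spread evenly and using the OSD counts mentioned in this thread (1 or 2 OSDs per server, 2 or 3 servers). The name pg_copies_per_osd is illustrative, not a Ceph tool.

```python
def pg_copies_per_osd(pg_num: int, size: int, num_osds: int) -> float:
    """Average number of PG replicas per OSD, assuming an even spread."""
    return pg_num * size / num_osds


# 2 OSDs (2 servers x 1), 3 OSDs (3 servers x 1), 6 OSDs (3 servers x 2)
for num_osds in (2, 3, 6):
    avg = pg_copies_per_osd(pg_num=128, size=2, num_osds=num_osds)
    print(f"{num_osds} OSDs: ~{avg:.1f} PG copies per OSD")
```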
[root@osd01 ~]# ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]
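
To make the rule dump above easier to read, here is a small Python sketch that pulls out the failure domain of each rule. It assumes the JSON layout shown in this message; failure_domains is a hypothetical helper name, and the embedded JSON is the same rule trimmed to the fields the sketch uses.

```python
import json

# The rule as printed by `ceph osd crush rule dump` above, trimmed for brevity.
RULE_JSON = """
[{"rule_id": 0, "rule_name": "replicated_rule",
  "steps": [
    {"op": "take", "item": -1, "item_name": "default"},
    {"op": "chooseleaf_firstn", "num": 0, "type": "host"},
    {"op": "emit"}
  ]}]
"""


def failure_domains(rules_json: str) -> dict:
    """Map each rule name to the bucket type of its choose/chooseleaf step."""
    result = {}
    for rule in json.loads(rules_json):
        for step in rule["steps"]:
            if step["op"].startswith("choose"):
                result[rule["rule_name"]] = step["type"]
    return result


print(failure_domains(RULE_JSON))
```

With "num": 0 the rule selects as many hosts as the pool's size, so this pool (size 2) places its two replicas on two different hosts.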

On Wed, 11 Apr 2018 at 08:50, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:

On 04/11/2018 07:48 PM, Steven Vacaroaia wrote:

> Thanks for the suggestion but, unfortunately, having the same number of
> OSDs did not solve the issue.
> Here it is with 2 OSDs per server, 3 servers - identical servers and OSD
> configuration

ceph osd pool ls detail

ceph osd crush rule dump

k

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com