Re: Bluestore crashing constantly with load on newly created cluster/host.

My host has 256GB of RAM; about 62GB is in use under the heaviest I/O workload.
_____________________________________________

Tyler Bishop
EST 2007


O: 513-299-7108 x1000
M: 513-646-5809


This email is intended only for the recipient(s) above and/or otherwise authorized personnel. The information contained herein and attached is confidential and the property of Beyond Hosting. Any unauthorized copying, forwarding, printing, and/or disclosing any information related to this email is prohibited. If you received this message in error, please contact the sender and destroy all copies of this email and any attachment(s).


On Mon, Aug 27, 2018 at 10:36 PM Alfredo Daniel Rezinovsky <alfredo.rezinovsky@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:

I had block.db on SSD, with 3 OSDs per host (8G RAM) and the default 3G bluestore_cache_size_ssd.

I stopped having inconsistencies after dropping the cache to 1G.
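
The arithmetic behind that change can be sketched roughly as below. This is only a budget check using the numbers quoted above (3 OSDs, 8G RAM, 3G versus 1G cache per OSD). The helper name `cache_budget` is hypothetical, and the sketch counts only the configured cache; actual OSD memory use is higher than the cache setting alone.

```python
GIB = 1024 ** 3

def cache_budget(num_osds, cache_bytes_per_osd, host_ram_bytes):
    """Return (total cache bytes, fraction of host RAM) for one host.

    Hypothetical helper: it counts only the configured BlueStore cache,
    not the rest of each OSD's memory footprint.
    """
    total = num_osds * cache_bytes_per_osd
    return total, total / host_ram_bytes

# Compare the default 3G cache with the reduced 1G cache on an 8G host.
for cache_gib in (3, 1):
    total, frac = cache_budget(3, cache_gib * GIB, 8 * GIB)
    print(f"{cache_gib}G cache x 3 OSDs = {total // GIB}G, "
          f"ratio to host RAM = {frac}")
```

With the 3G default, the three caches alone would exceed the host's 8G of RAM; at 1G each they fit with room to spare.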


On 27/08/18 23:32, Tyler Bishop wrote:
I'm having a constant segfault issue under I/O load with my newly created BlueStore deployment.


Setup is a 28GB SSD LVM volume for block.db and a 6T spinner for data.

Config:
[global]
fsid =  REDACTED
mon_initial_members = cephmon-1001, cephmon-1002, cephmon-1003
mon_host = 10.20.142.5,10.20.142.6,10.20.142.7
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true

# Fixes issue where image is created with newer than supported features enabled.
rbd_default_features = 3


# Debug Tuning
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcatcher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0

[osd]
osd_mkfs_type = xfs
osd_mount_options_xfs = rw,noatime,,nodiratime,inode64,logbsize=256k,delaylog
osd_mkfs_options_xfs = -f -i size=2048
osd_journal_size = 10240
filestore_queue_max_ops=1000
filestore_queue_max_bytes = 1048576000
filestore_max_sync_interval = 10
filestore_merge_threshold = 500
filestore_split_multiple = 100
osd_op_shard_threads = 6
journal_max_write_entries = 5000
journal_max_write_bytes = 1048576000
journal_queueu_max_ops = 3000
journal_queue_max_bytes = 1048576000
ms_dispatch_throttle_bytes = 1048576000
objecter_inflight_op_bytes = 1048576000
public network = 10.20.142.0/24
cluster_network = 10.20.136.0/24
osd_disk_thread_ioprio_priority = 7
osd_disk_thread_ioprio_class = idle
osd_max_backfills = 2
osd_recovery_sleep = 0.10


[client]
rbd_cache = False
rbd cache size = 33554432
rbd cache target dirty = 16777216
rbd cache max dirty = 25165824
rbd cache max dirty age = 2
rbd cache writethrough until flush = false


--------


2018-08-28 02:31:30.961954 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/flush_job.cc:319] [default] [JOB 19] Level-0 flush table #688: 6121532 bytes OK
2018-08-28 02:31:30.962476 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_files.cc:242] adding log 681 to recycle list

2018-08-28 02:31:30.962495 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:31:30.961973) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/memtable_list.cc:360] [default] Level-0 commit table #688 started
2018-08-28 02:31:30.962501 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:31:30.962413) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/memtable_list.cc:383] [default] Level-0 commit table #688: memtable #1 done
2018-08-28 02:31:30.962505 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:31:30.962432) EVENT_LOG_v1 {"time_micros": 1535423490962423, "job": 19, "event": "flush_finished", "lsm_state": [1, 4, 1, 0, 0, 0, 0], "immutable_memtables": 0}
2018-08-28 02:31:30.962509 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:31:30.962458) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_compaction_flush.cc:132] [default] Level summary: base level 1 max bytes base 268435456 files[1 4 1 0 0 0 0] max score 0.84

2018-08-28 02:31:30.962517 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_files.cc:388] [JOB 19] Try to delete WAL files size 258068015, prev total WAL file size 260608480, number of live WAL files 2.

2018-08-28 02:32:06.102335 7f64b917b700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_write.cc:684] reusing log 681 from recycle list

2018-08-28 02:32:06.102473 7f64b917b700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_write.cc:725] [default] New memtable created with log file: #689. Immutable memtables: 0.

2018-08-28 02:32:06.102542 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_compaction_flush.cc:49] [JOB 20] Syncing log #687
2018-08-28 02:32:06.103394 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:32:06.102527) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_compaction_flush.cc:1158] Calling FlushMemTableToOutputFile with column family [default], flush slots available 1, compaction slots allowed 1, compaction slots scheduled 1
2018-08-28 02:32:06.103407 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/flush_job.cc:264] [default] [JOB 20] Flushing memtable with next log file: 689

2018-08-28 02:32:06.103435 7f64a895a700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1535423526103422, "job": 20, "event": "flush_started", "num_memtables": 1, "num_entries": 97689, "num_deletes": 21335, "memory_usage": 260069984}
2018-08-28 02:32:06.103446 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/flush_job.cc:293] [default] [JOB 20] Level-0 flush table #690: started
2018-08-28 02:32:06.155755 7f64a895a700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1535423526155726, "cf_name": "default", "job": 20, "event": "table_file_creation", "file_number": 690, "file_size": 6343137, "table_properties": {"data_size": 6153638, "index_size": 65232, "filter_size": 123278, "raw_key_size": 2289031, "raw_average_key_size": 52, "raw_value_size": 5374531, "raw_average_value_size": 122, "num_data_blocks": 1047, "num_entries": 43785, "filter_policy_name": "rocksdb.BuiltinBloomFilter", "kDeletedKeys": "21429", "kMergeOperands": "220"}}
2018-08-28 02:32:06.155776 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/flush_job.cc:319] [default] [JOB 20] Level-0 flush table #690: 6343137 bytes OK
2018-08-28 02:32:06.156214 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_files.cc:242] adding log 687 to recycle list

2018-08-28 02:32:06.156225 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:32:06.155790) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/memtable_list.cc:360] [default] Level-0 commit table #690 started
2018-08-28 02:32:06.156229 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:32:06.156164) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/memtable_list.cc:383] [default] Level-0 commit table #690: memtable #1 done
2018-08-28 02:32:06.156239 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:32:06.156178) EVENT_LOG_v1 {"time_micros": 1535423526156172, "job": 20, "event": "flush_finished", "lsm_state": [2, 4, 1, 0, 0, 0, 0], "immutable_memtables": 0}
2018-08-28 02:32:06.156244 7f64a895a700  4 rocksdb: (Original Log Time 2018/08/28-02:32:06.156199) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_compaction_flush.cc:132] [default] Level summary: base level 1 max bytes base 268435456 files[2 4 1 0 0 0 0] max score 0.84

2018-08-28 02:32:06.156252 7f64a895a700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.7/rpm/el7/BUILD/ceph-12.2.7/src/rocksdb/db/db_impl_files.cc:388] [JOB 20] Try to delete WAL files size 257866117, prev total WAL file size 259275521, number of live WAL files 2.
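
When reading a log like the one above, the `EVENT_LOG_v1` lines carry a JSON payload that can be pulled out for easier comparison between flush jobs. The following is a minimal sketch, assuming the payload always follows the `EVENT_LOG_v1 ` marker on a single line as it does in this excerpt; the helper name `parse_event` is hypothetical.

```python
import json

MARKER = "EVENT_LOG_v1 "

def parse_event(line):
    """Return the JSON payload of a rocksdb EVENT_LOG_v1 line, or None.

    Assumes the JSON object is everything after the marker on the line.
    """
    idx = line.find(MARKER)
    if idx == -1:
        return None
    return json.loads(line[idx + len(MARKER):])

# Sample line copied from the log excerpt above.
sample = (
    '2018-08-28 02:32:06.103435 7f64a895a700  4 rocksdb: EVENT_LOG_v1 '
    '{"time_micros": 1535423526103422, "job": 20, "event": "flush_started", '
    '"num_memtables": 1, "num_entries": 97689, "num_deletes": 21335, '
    '"memory_usage": 260069984}'
)

event = parse_event(sample)
print(event["event"], event["job"], event["memory_usage"])
```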



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Alfredo Daniel Rezinovsky
Director de Tecnologías de Información y Comunicaciones
Facultad de Ingeniería - Universidad Nacional de Cuyo