Hi Erwin,
You might want to increase the OSD logging level to see what's happening.
I would suggest setting debug-bdev, debug-bluefs and debug-bluestore to 10
(or even 20).
But be cautious: this can result in a huge log...
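As a rough sketch only: one way to raise these levels for just the affected OSD is a per-daemon section in ceph.conf on its host. The osd.3 name is taken from the log below. The file path is the usual default and is an assumption here, as is the choice of ceph.conf over the central config database; on a containerised deployment the file may live elsewhere.

```
# /etc/ceph/ceph.conf on the host of the affected OSD (assumed default path)
[osd.3]
# 10 as a starting point; 20 is far more verbose
debug_bdev = 10
debug_bluefs = 10
debug_bluestore = 10
```

Start the OSD again afterwards and remove the section once the log is captured. The same options can presumably also be set centrally, e.g. `ceph config set osd.3 debug_bluestore 10`, if the daemon gets far enough to fetch its config from the monitors.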
Thanks,
Igor
On 10/16/2024 3:01 PM, Erwin Bogaard wrote:
Hi,
We're experiencing issues with a few OSDs. They crashed, and now they won't
start anymore. Nothing seems wrong with them, but they keep hanging with
apparent 100% I/O wait on the machine when starting the OSD.
This is on Ceph 18.2.4.
This is the log (edited a bit, as it's too long):
2024-10-16T13:48:05.075+0200 7fc771a10640 0 set uid:gid to 167:167
(ceph:ceph)
2024-10-16T13:48:05.075+0200 7fc771a10640 0 ceph version 18.2.4
(e7ad5345525c7aa95470c26863873b581076945d) reef (stable), process ceph-osd,
pid 1223
2024-10-16T13:48:05.076+0200 7fc771a10640 0 pidfile_write: ignore empty
--pid-file
2024-10-16T13:48:05.086+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:05.086+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:05.087+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:05.088+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:05.088+0200 7fc771a10640 1 bdev(0x5630304bf180
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:05.088+0200 7fc771a10640 0 bdev(0x5630304bf180
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:05.089+0200 7fc771a10640 1 bdev(0x5630304bf180
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:05.089+0200 7fc771a10640 1 bluefs add_block_device bdev 1
path /var/lib/ceph/osd/ceph-3/block size 1.8 TiB
2024-10-16T13:48:05.089+0200 7fc771a10640 1 bdev(0x5630304bf180
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:05.352+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:05.605+0200 7fc771a10640 0 starting osd.3 osd_data
/var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
2024-10-16T13:48:05.605+0200 7fc771a10640 -1 Falling back to public
interface
2024-10-16T13:48:05.611+0200 7fc771a10640 0 load: jerasure load: lrc
2024-10-16T13:48:05.612+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:05.612+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:05.612+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:05.612+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:05.612+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:05.886+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:05.886+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:05.887+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:05.888+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:05.888+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:06.157+0200 7fc771a10640 0 osd.3:0.OSDShard using op
scheduler ClassedOpQueueScheduler(queue=wpq, cutoff=196)
2024-10-16T13:48:06.157+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:06.157+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:06.157+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:06.159+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:06.159+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:06.422+0200 7fc771a10640 0 osd.3:1.OSDShard using op
scheduler ClassedOpQueueScheduler(queue=wpq, cutoff=196)
2024-10-16T13:48:06.422+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:06.422+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:06.422+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:06.424+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:06.424+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:06.684+0200 7fc771a10640 0 osd.3:2.OSDShard using op
scheduler ClassedOpQueueScheduler(queue=wpq, cutoff=196)
2024-10-16T13:48:06.684+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:06.684+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:06.684+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:06.686+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:06.687+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:06.947+0200 7fc771a10640 0 osd.3:3.OSDShard using op
scheduler ClassedOpQueueScheduler(queue=wpq, cutoff=196)
2024-10-16T13:48:06.947+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:06.947+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:06.948+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:06.949+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:06.949+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) close
2024-10-16T13:48:07.214+0200 7fc771a10640 0 osd.3:4.OSDShard using op
scheduler ClassedOpQueueScheduler(queue=wpq, cutoff=196)
2024-10-16T13:48:07.219+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:07.219+0200 7fc771a10640 0 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:07.219+0200 7fc771a10640 1 bdev(0x5630304bee00
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:07.219+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes cache_size 1073741824
meta 0.45 kv 0.45 data 0.06
2024-10-16T13:48:07.219+0200 7fc771a10640 1 bdev(0x5630304bf180
/var/lib/ceph/osd/ceph-3/block) open path /var/lib/ceph/osd/ceph-3/block
2024-10-16T13:48:07.219+0200 7fc771a10640 0 bdev(0x5630304bf180
/var/lib/ceph/osd/ceph-3/block) ioctl(F_SET_FILE_RW_HINT) on
/var/lib/ceph/osd/ceph-3/block failed: (22) Invalid argument
2024-10-16T13:48:07.220+0200 7fc771a10640 1 bdev(0x5630304bf180
/var/lib/ceph/osd/ceph-3/block) open size 1999995142144 (0x1d1a9000000, 1.8
TiB) block_size 4096 (4 KiB) rotational device, discard not supported
2024-10-16T13:48:07.220+0200 7fc771a10640 1 bluefs add_block_device bdev 1
path /var/lib/ceph/osd/ceph-3/block size 1.8 TiB
2024-10-16T13:48:07.220+0200 7fc771a10640 1 bluefs mount
2024-10-16T13:48:07.221+0200 7fc771a10640 1 bluefs _init_alloc shared, id
1, capacity 0x1d1a9000000, block size 0x10000
2024-10-16T13:48:07.335+0200 7fc771a10640 1 bluefs mount shared_bdev_used
= 0
2024-10-16T13:48:07.335+0200 7fc771a10640 1
bluestore(/var/lib/ceph/osd/ceph-3) _prepare_db_environment set db_paths to
db,1899995385036 db.slow,1899995385036
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: RocksDB version: 7.9.2
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: Git sha 0
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: Compile date
2024-07-12 14:24:06
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: DB SUMMARY
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: DB Session ID:
FNW6XLKJV5A5TZMHCHLU
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: CURRENT file: CURRENT
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: IDENTITY file:
IDENTITY
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: MANIFEST file:
MANIFEST-020722 size: 59265 Bytes
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: SST files in db dir,
Total Num: 109, files: 021121.sst 021122.sst 021123.sst 021124.sst
021125.sst 021126.sst 021127.sst 021128.sst 021129.sst
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: SST files in db.slow
dir, Total Num: 0, files:
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb: Write Ahead Log file
in db.wal: 021323.log size: 15835454 ; 021325.log size: 12760055 ;
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb:
Options.error_if_exists: 0
2024-10-16T13:48:07.345+0200 7fc771a10640 4 rocksdb:
Options.create_if_missing: 0
...
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.compaction_readahead_size: 2097152
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.max_background_flushes: -1
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: Compression
algorithms supported:
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: kZSTD supported: 0
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: kXpressCompression
supported: 0
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: kBZip2Compression
supported: 0
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
kZSTDNotFinalCompression supported: 0
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: kLZ4Compression
supported: 1
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: kZlibCompression
supported: 1
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: kLZ4HCCompression
supported: 1
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: kSnappyCompression
supported: 1
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: Fast CRC32 supported:
Supported on x86
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb: DMutex
implementation: pthread_mutex_t
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
[db/db_impl/db_impl_readonly.cc:25] Opening the db in read only mode
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
[db/version_set.cc:5527] Recovering from manifest file: db/MANIFEST-020722
2024-10-16T13:48:07.346+0200 7fc771a10640 2 rocksdb:
[db/column_family.cc:578] Failed to register data paths of column family
(id: 0, name: default)
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
[db/column_family.cc:630] --------------- Options for column family
[default]:
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.comparator: leveldb.BytewiseComparator
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.merge_operator: .T:int64_array.b:bitwise_xor
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.compaction_filter: None
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.compaction_filter_factory: None
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.sst_partitioner_factory: None
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.memtable_factory: SkipListFactory
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.table_factory: BlockBasedTable
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
table_factory options: flush_block_policy_factory:
FlushBlockBySizePolicyFactory (0x5630304b89a0)
cache_index_and_filter_blocks: 1
cache_index_and_filter_blocks_with_high_priority: 0
pin_l0_filter_and_index_blocks_in_cache: 0
pin_top_level_index_and_filter: 1
index_type: 0
data_block_index_type: 0
index_shortening: 1
data_block_hash_table_util_ratio: 0.750000
checksum: 4
no_block_cache: 0
block_cache: 0x563030498dd0
block_cache_name: BinnedLRUCache
block_cache_options:
capacity : 483183820
num_shard_bits : 4
strict_capacity_limit : 0
high_pri_pool_ratio: 0.000
block_cache_compressed: (nil)
persistent_cache: (nil)
block_size: 4096
block_size_deviation: 10
block_restart_interval: 16
index_block_restart_interval: 1
metadata_block_size: 4096
partition_filters: 0
use_delta_encoding: 1
filter_policy: bloomfilter
whole_key_filtering: 1
verify_compression: 0
read_amp_bytes_per_bit: 0
format_version: 5
enable_index_compression: 1
block_align: 0
max_auto_readahead_size: 262144
prepopulate_block_cache: 0
initial_auto_readahead_size: 8192
num_file_reads_for_auto_readahead: 2
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.write_buffer_size: 16777216
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.max_write_buffer_number: 64
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.compression: LZ4
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.bottommost_compression: Disabled
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.prefix_extractor: nullptr
...
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.blob_garbage_collection_age_cutoff: 0.250000
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.blob_garbage_collection_force_threshold: 1.000000
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.blob_compaction_readahead_size: 0
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.blob_file_starting_level: 0
2024-10-16T13:48:07.346+0200 7fc771a10640 4 rocksdb:
Options.experimental_mempurge_threshold: 0.000000
Then it hangs here, seemingly indefinitely.
Hope someone can shed some light!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
--
Igor Fedotov
Ceph Lead Developer