On 14/02/2022 16:07, Igor Fedotov wrote:
Hi Andrej,
On 2/12/2022 9:56 AM, Andrej Filipcic wrote:
On 11/02/2022 15:22, Igor Fedotov wrote:
Hi Andrej,
you might want to set debug_bluestore and debug_bluefs to 10 and
check what's happening during the startup...
Alternatively you might try to compact the slow OSD's DB using
ceph-kvstore-tool and check if that helps to speed up the startup...
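(For reference, a rough sketch of the two suggestions above, assuming
osd.0 and the default data path /var/lib/ceph/osd/ceph-0; adjust to
your deployment:

    # raise debug levels for the next startup
    ceph config set osd.0 debug_bluestore 10
    ceph config set osd.0 debug_bluefs 10

    # offline compaction of the OSD's RocksDB; stop the OSD first
    systemctl stop ceph-osd@0
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
    systemctl start ceph-osd@0
)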
with debug 10,
you set both debug bluefs and bluestore to 10, right?
yes.
We have already drained and recreated most of those OSDs. Some are
still there and can be debugged further (the log you asked for is
linked below). One more thing I found strange: even the drained
OSDs (marked out, with no PGs) are very slow to start; it takes
about as long as when they were half full.
...
2022-02-12T07:50:18.672+0100 7ff0ad32b700 4 rocksdb:
[compaction/compaction_job.cc:1349] [default] [JOB 14] Generated
table #867263: 84569 keys, 45766553 bytes
2022-02-12T07:50:18.672+0100 7ff0ad32b700 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1644648618672869, "cf_name": "default", "job": 14,
"event": "table_file_creation", "file_number": 867263, "file_size":
45766553, "table_properties": {"data_size": 44720602, "index_size":
833544, "index_partitions": 0, "top_level_index_size": 0,
"index_key_is_user_key": 0, "index_value_is_delta_encoded": 0,
"filter_size": 211525, "raw_key_size": 5705417,
"raw_average_key_size": 67, "raw_value_size": 42291491,
"raw_average_value_size": 500, "num_data_blocks": 11468,
"num_entries": 84569, "num_deletions": 0, "num_merge_operands": 0,
"num_range_deletions": 0, "format_version": 0, "fixed_key_len": 0,
"filter_policy": "rocksdb.BuiltinBloomFilter", "column_family_name":
"default", "column_family_id": 0, "comparator":
"leveldb.BytewiseComparator", "merge_operator":
".T:int64_array.b:bitwise_xor", "prefix_extractor_name": "nullptr",
"property_collectors": "[]", "compression": "NoCompression",
"compression_options": "window_bits=-14; level=32767; strategy=0;
max_dict_bytes=0; zstd_max_train_bytes=0; enabled=0; ",
"creation_time": 1626461453, "oldest_key_time": 0,
"file_creation_time": 1644648616}}
2022-02-12T07:50:18.693+0100 7ff0ad32b700 4 rocksdb:
[compaction/compaction_job.cc:1415] [default] [JOB 14] Compacted 1@3
+ 25@4 files to L4 => 1625871428 bytes
2022-02-12T07:50:18.827+0100 7ff0ad32b700 4 rocksdb: (Original Log
Time 2022/02/12-07:50:18.827605) [compaction/compaction_job.cc:760]
[default] compacted to: files[0 1 30 348 1853 0 0] max score 0.95,
MB/sec: 30.0 rd, 28.7 wr, level 4, files in(1, 25) out(24) MB
in(50.9, 1574.2) out(1550.6), read-write-amplify(62.3)
write-amplify(30.4) OK, records in: 3428443, records dropped: 404881
output_compression: NoCompression
2022-02-12T07:50:18.827+0100 7ff0ad32b700 4 rocksdb: (Original Log
Time 2022/02/12-07:50:18.827635) EVENT_LOG_v1 {"time_micros":
1644648618827623, "job": 14, "event": "compaction_finished",
"compaction_time_micros": 56736833, "compaction_time_cpu_micros":
4401723, "output_level": 4, "num_output_files": 24,
"total_output_size": 1625871428, "num_input_records": 3428443,
"num_output_records": 3023562, "num_subcompactions": 1,
"output_compression": "NoCompression",
"num_single_delete_mismatches": 0, "num_single_delete_fallthrough":
0, "lsm_state": [0, 1, 30, 348, 1853, 0, 0]}
2022-02-12T07:50:18.827+0100 7ff0ad32b700 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1644648618828078, "job": 14, "event":
"table_file_deletion", "file_number": 867202}
2022-02-12T07:50:18.827+0100 7ff0acb2a700 4 rocksdb: (Original Log
Time 2022/02/12-07:50:18.828165)
[db_impl/db_impl_compaction_flush.cc:2611] Compaction nothing to do
The above is reached in ~4 minutes, and then the extreme reading
takes over with no additional logging. So I guess compaction is not
the issue.
Can I see the bluestore/bluefs output preceding rocksdb output, please?
I have copied the full log here; it's rather large:
http://www-f9.ijs.si/~andrej/ceph-osd.0.log-20220213.xz
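(If it helps to narrow things down, the bluefs/bluestore lines can be
pulled out of the compressed log with something like:

    xzcat ceph-osd.0.log-20220213.xz | grep -E 'bluefs|bluestore' | less
)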
Best regards,
Andrej
Just in case - is bluefs_buffered_io set to true for these OSDs?
Yes, that's set.
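(For reference, a quick way to confirm that at runtime, assuming osd.0:

    ceph config show osd.0 bluefs_buffered_io
    # or on the OSD host, via the admin socket:
    ceph daemon osd.0 config get bluefs_buffered_io
)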
Andrej
Thanks,
Igor
On 2/11/2022 11:14 AM, Andrej Filipcic wrote:
Hi,
With 16.2.7, some OSDs are very slow to start, e.g. it takes ~30 min
for an HDD (12 TB, 5 TB used) to become active. After initialization,
there is 20-40 min of extreme reading at ~150 MB/s from the OSD, just
after
-------
Uptime(secs): 602.2 total, 0.0 interval
Flush(GB): cumulative 0.101, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.72 GB write, 1.22 MB/s write, 0.71 GB
read, 1.21 MB/s read, 33.1 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read,
0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0
level0_slowdown_with_compaction, 0 level0_numfiles, 0
level0_numfiles_with_compaction, 0 stop for
pending_compaction_bytes, 0 slowdown for pending_compaction_bytes,
0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
** File Read Latency Histogram By Level [default] **
-------
and the log (no special debug conf) is not updated during that time.
These slow OSDs were created with Nautilus, and I am not quite sure
when this slow startup appeared. The newer OSDs are fast to start.
The minimum OSD release is set to pacific.
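(For reference, that setting can be double-checked with e.g.:

    ceph osd dump | grep require_osd_release
)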
Any hints before I start debugging the problem? The install history:
2020-09-15T09:42:29Z SUBDEBUG Installed:
ceph-base-2:15.2.4-0.el8.x86_64
2020-12-23T11:02:29Z SUBDEBUG Upgrade: ceph-base-2:15.2.8-0.el8.x86_64
2021-08-31T08:46:29Z SUBDEBUG Upgrade: ceph-base-2:16.2.5-0.el8.x86_64
2022-02-09T09:38:42+0100 SUBDEBUG Upgrade:
ceph-base-2:16.2.7-0.el8.x86_64
Best regards,
Andrej
--
_____________________________________________________________
prof. dr. Andrej Filipcic, E-mail: Andrej.Filipcic@xxxxxx
Department of Experimental High Energy Physics - F9
Jozef Stefan Institute, Jamova 39, P.o.Box 3000
SI-1001 Ljubljana, Slovenia
Tel.: +386-1-477-3674 Fax: +386-1-425-7074
-------------------------------------------------------------
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx