You might want to try manual RocksDB compaction using ceph-kvstore-tool.
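For example (a sketch only, run with the OSD stopped; the path is the one mentioned later in this thread, adjust the OSD id as needed):

  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-80 compact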
Sent from my Huawei tablet
-------- Original Message --------
Subject: Re: 3 OSDs stopped and unable to restart
From: Brett Chancellor
To: Igor Fedotov
CC: Ceph Users
Once backfilling finished, the cluster was super slow and most OSDs were filled with heartbeat_map errors. When an OSD restarts, it causes a cascade of other OSDs to follow suit and restart. Logs look like:
-3> 2019-07-10 18:34:50.046 7f34abf5b700 -1 osd.69 1348581 get_health_metrics reporting 21 slow ops, oldest is osd_op(client.115295041.0:17575966 15.c37fa482 15.c37fa482 (undecoded) ack+ondisk+write+known_if_redirected e1348522)
-2> 2019-07-10 18:34:50.967 7f34acf5d700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f3493f2b700' had timed out after 90
-1> 2019-07-10 18:34:50.967 7f34acf5d700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f3493f2b700' had suicide timed out after 150
0> 2019-07-10 18:34:51.025 7f3493f2b700 -1 *** Caught signal (Aborted) **
in thread 7f3493f2b700 thread_name:tp_osd_tp
On Tue, Jul 9, 2019 at 1:38 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
This will cap a single BlueFS space allocation. Currently it attempts to allocate ~70 GB, which seems to overflow some 32-bit length fields. With the adjustment, such an allocation should be capped at ~700 MB.
I doubt there is any relation between this specific failure and the pool, at least at the moment. In short, the history is: the starting OSD tries to flush BlueFS data to disk, detects a lack of space and asks the main device for more; the allocation succeeds, but the returned extent has its length field set to 0.
On 7/9/2019 8:33 PM, Brett Chancellor wrote:
What does bluestore_bluefs_gift_ratio do? I can't find any documentation on it. Also, do you think this could be related to the .rgw.meta pool having too many objects per PG? The disks that die always seem to be backfilling a PG from that pool, and they have ~550k objects per PG.

-Brett
On Tue, Jul 9, 2019 at 1:03 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
Please try to set bluestore_bluefs_gift_ratio to 0.0002.
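A sketch of one way to apply it (assuming the option can be injected at runtime on this release; otherwise set it in ceph.conf and restart the OSDs):

  ceph tell 'osd.*' injectargs '--bluestore_bluefs_gift_ratio 0.0002'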
On 7/9/2019 7:39 PM, Brett Chancellor wrote:
Too large for pastebin. The problem is continually crashing new OSDs. Here is the latest one.
On Tue, Jul 9, 2019 at 11:46 AM Igor Fedotov <ifedotov@xxxxxxx> wrote:
Could you please set debug bluestore to 20 and collect a startup log for this specific OSD once again?
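One way to do that, as a sketch (OSD id 80 is taken from elsewhere in this thread; adjust as needed): add "debug bluestore = 20" under [osd] in ceph.conf on that host, then restart the OSD and grab its log, e.g.

  systemctl restart ceph-osd@80
  less /var/log/ceph/ceph-osd.80.log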
On 7/9/2019 6:29 PM, Brett Chancellor wrote:
I restarted most of the OSDs with the stupid allocator (6 of them wouldn't start unless the bitmap allocator was set), but I'm still seeing issues with OSDs crashing. Interestingly, it seems that the dying OSDs are always working on a PG from the .rgw.meta pool when they crash.
A known issue with the stupid allocator is a gradual increase in write request latency (occurring within several days after an OSD restart). It is seldom observed, though. There were some posts about that behavior on the mailing list this year.

Thanks,
Igor.
On 7/8/2019 8:33 PM, Brett Chancellor wrote:
I'll give that a try. Is it something like...

ceph tell 'osd.*' bluestore_allocator stupid
ceph tell 'osd.*' bluefs_allocator stupid

And should I expect any issues doing this?
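For reference, a sketch of how these options are usually set on 14.2.x, assuming the monitor config database is in use; the allocator change only takes effect once the OSDs are restarted:

  ceph config set osd bluestore_allocator stupid
  ceph config set osd bluefs_allocator stupid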
On Mon, Jul 8, 2019 at 1:04 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
I should have read the call stack more carefully... It's not about lacking free space - this is rather the bug from this ticket:

You should upgrade to v14.2.2 (once it's available) or temporarily switch to the stupid allocator as a workaround.

Thanks,
Igor
On 7/8/2019 8:00 PM, Igor Fedotov wrote:
Hi Brett,

looks like BlueStore is unable to allocate additional space for BlueFS at the main device. It's either lacking free space or it's too fragmented...

Would you share the osd log, please?

Also please run "ceph-bluestore-tool --path <substitute with path-to-osd!!!> bluefs-bdev-sizes" and share the output.
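For a concrete example, using the OSD path that appears later in this thread (run with the OSD stopped; a sketch only):

  ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 bluefs-bdev-sizes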
Thanks,
Igor
On 7/3/2019 9:59 PM, Brett Chancellor wrote:
Hi All! Today I've had 3 OSDs stop themselves, and they are unable to restart, all with the same error. These OSDs are all on different hosts. All are running 14.2.1.

I did try the following two commands:
- ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-80 list > keys
2019-07-03 18:30:02.095 7fe7c1c1ef00 -1 bluestore(/var/lib/ceph/osd/ceph-80) fsck warning: legacy statfs record found, suggest to run store repair to get consistent statistic reports
fsck success
## Error when trying to start one of the OSDs
-12> 2019-07-03 18:36:57.450 7f5e42366700 -1 *** Caught signal (Aborted) **
in thread 7f5e42366700 thread_name:rocksdb:low0