Re: Bluefs spillover

Ruben,

given the recorded maximums for the low DB levels (roughly 47 GB), it looks like the OSD DBs took much more space in the past. Perhaps you performed some bulk data removals recently; compaction could cause such a drop as well.

Anyway, this prevents BlueStore from using the extra DB space for the high (aka SLOW) levels - they land on the main (slow) device instead. (Note that the recorded TOTAL maximums, ~50 GiB, are at or above the 48 GiB DB device size.)

So you have two options:

1) Expand the DB volume to match your potential metadata size (a sketch follows below).

2) If you expect no metadata growth - migrate the data off the slow device as described in the link I shared (also sketched below). Since migration is an offline process, it restarts the OSDs and hence resets the recorded maximums. After that, BlueStore will likely be able to use the extra DB space for new data (to some degree, of course).
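
To illustrate, a minimal sketch of option 1, assuming the DB sits on an LVM logical volume (the VG/LV names, size and OSD id below are placeholders; stop the OSD first):

  systemctl stop ceph-osd@91
  lvextend -L +32G ceph-db-vg/db-osd91
  ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-91
  systemctl start ceph-osd@91

And a rough sketch of option 2, following the linked thread (again placeholder names; the OSD fsid can be found via 'ceph-volume lvm list'). Here '--from data' means "take the BlueFS files that landed on the main/slow device":

  systemctl stop ceph-osd@91
  ceph-volume lvm migrate --osd-id 91 --osd-fsid <osd-fsid> --from data --target ceph-db-vg/db-osd91
  systemctl start ceph-osd@91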


Thanks,

Igor


On 8/26/2024 12:37 PM, Ruben Bosch wrote:
Hi Igor,

Thank you for your fast reply. I'll look into the URL you provided.
Please see below:

for i in osd.17 osd.37 osd.89 osd.91 osd.95 osd.106; do ceph tell $i bluefs stats; done
1 : device size 0xc7fffe000 : using 0x423c00000(17 GiB)
2 : device size 0x9187fc00000 : using 0x5bb0bab0000(5.7 TiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL         DB          SLOW        *           *           REAL        FILES
LOG         0 B         18 MiB      0 B         0 B         0 B         15 MiB      1
WAL         0 B         36 MiB      0 B         0 B         0 B         28 MiB      2
DB          0 B         17 GiB      0 B         0 B         0 B         13 GiB      211
SLOW        0 B         0 B         70 MiB      0 B         0 B         62 MiB      1
TOTAL       0 B         17 GiB      70 MiB      0 B         0 B         0 B         215
MAXIMUMS:
LOG         0 B         22 MiB      0 B         0 B         0 B         18 MiB
WAL         0 B         126 MiB     0 B         0 B         0 B         92 MiB
DB          0 B         47 GiB      986 MiB     0 B         0 B         17 GiB
SLOW        0 B         3.0 GiB     352 MiB     0 B         0 B         2.4 GiB
TOTAL       0 B         50 GiB      1.3 GiB     0 B         0 B         0 B
SIZE <<  0 B         48 GiB      8.6 TiB
1 : device size 0xc7fffe000 : using 0x434200000(17 GiB)
2 : device size 0x9187fc00000 : using 0x5fd47880000(6.0 TiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL         DB          SLOW        *           *           REAL        FILES
LOG         0 B         14 MiB      0 B         0 B         0 B         9.4 MiB     1
WAL         0 B         18 MiB      0 B         0 B         0 B         11 MiB      1
DB          0 B         17 GiB      0 B         0 B         0 B         13 GiB      216
SLOW        0 B         0 B         70 MiB      0 B         0 B         53 MiB      1
TOTAL       0 B         17 GiB      70 MiB      0 B         0 B         0 B         219
MAXIMUMS:
LOG         0 B         22 MiB      0 B         0 B         0 B         18 MiB
WAL         0 B         126 MiB     0 B         0 B         0 B         93 MiB
DB          0 B         48 GiB      0 B         0 B         0 B         16 GiB
SLOW        0 B         1.9 GiB     141 MiB     0 B         0 B         1.5 GiB
TOTAL       0 B         49 GiB      141 MiB     0 B         0 B         0 B
SIZE <<  0 B         48 GiB      8.6 TiB
1 : device size 0xc7fffe000 : using 0x458500000(17 GiB)
2 : device size 0x9187fc00000 : using 0x6b27b860000(6.7 TiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL         DB          SLOW        *           *           REAL        FILES
LOG         0 B         14 MiB      0 B         0 B         0 B         12 MiB      1
WAL         0 B         72 MiB      0 B         0 B         0 B         51 MiB      4
DB          0 B         17 GiB      0 B         0 B         0 B         14 GiB      231
SLOW        0 B         0 B         70 MiB      0 B         0 B         60 MiB      1
TOTAL       0 B         17 GiB      70 MiB      0 B         0 B         0 B         237
MAXIMUMS:
LOG         0 B         22 MiB      0 B         0 B         0 B         18 MiB
WAL         0 B         198 MiB     0 B         0 B         0 B         172 MiB
DB          0 B         48 GiB      0 B         0 B         0 B         20 GiB
SLOW        0 B         1.7 GiB     423 MiB     0 B         0 B         1.1 GiB
TOTAL       0 B         49 GiB      423 MiB     0 B         0 B         0 B
SIZE <<  0 B         48 GiB      8.6 TiB
1 : device size 0xc7fffe000 : using 0x476800000(18 GiB)
2 : device size 0x9187fc00000 : using 0x6a6d46f0000(6.7 TiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL         DB          SLOW        *           *           REAL        FILES
LOG         0 B         14 MiB      0 B         0 B         0 B         9.6 MiB     1
WAL         0 B         54 MiB      0 B         0 B         0 B         34 MiB      3
DB          0 B         18 GiB      0 B         0 B         0 B         14 GiB      238
SLOW        0 B         71 MiB      141 MiB     0 B         0 B         5.9 MiB     3
TOTAL       0 B         18 GiB      141 MiB     0 B         0 B         0 B         245
MAXIMUMS:
LOG         0 B         22 MiB      0 B         0 B         0 B         18 MiB
WAL         0 B         180 MiB     0 B         0 B         0 B         155 MiB
DB          0 B         41 GiB      0 B         0 B         0 B         19 GiB
SLOW        0 B         6.5 GiB     2.5 GiB     0 B         0 B         6.8 GiB
TOTAL       0 B         43 GiB      2.5 GiB     0 B         0 B         0 B
SIZE <<  0 B         48 GiB      8.6 TiB
1 : device size 0xc7fffe000 : using 0x405100000(16 GiB)
2 : device size 0x9187fc00000 : using 0x5bb806e0000(5.7 TiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL         DB          SLOW        *           *           REAL        FILES
LOG         0 B         18 MiB      0 B         0 B         0 B         13 MiB      1
WAL         0 B         36 MiB      0 B         0 B         0 B         15 MiB      2
DB          0 B         16 GiB      0 B         0 B         0 B         13 GiB      210
SLOW        0 B         0 B         141 MiB     0 B         0 B         71 MiB      2
TOTAL       0 B         16 GiB      141 MiB     0 B         0 B         0 B         215
MAXIMUMS:
LOG         0 B         22 MiB      0 B         0 B         0 B         18 MiB
WAL         0 B         126 MiB     0 B         0 B         0 B         93 MiB
DB          0 B         48 GiB      0 B         0 B         0 B         16 GiB
SLOW        0 B         2.0 GiB     141 MiB     0 B         0 B         1.8 GiB
TOTAL       0 B         50 GiB      141 MiB     0 B         0 B         0 B
SIZE <<  0 B         48 GiB      8.6 TiB
1 : device size 0xc7fffe000 : using 0x3cdd00000(15 GiB)
2 : device size 0x9187fc00000 : using 0x5bb9b2e0000(5.7 TiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL         DB          SLOW        *           *           REAL        FILES
LOG         0 B         6 MiB       0 B         0 B         0 B         3.4 MiB     1
WAL         0 B         108 MiB     0 B         0 B         0 B         78 MiB      6
DB          0 B         15 GiB      0 B         0 B         0 B         12 GiB      202
SLOW        0 B         142 MiB     70 MiB      0 B         0 B         34 MiB      3
TOTAL       0 B         15 GiB      70 MiB      0 B         0 B         0 B         212
MAXIMUMS:
LOG         0 B         22 MiB      0 B         0 B         0 B         18 MiB
WAL         0 B         126 MiB     0 B         0 B         0 B         93 MiB
DB          0 B         49 GiB      563 MiB     0 B         0 B         16 GiB
SLOW        0 B         1014 MiB    323 MiB     0 B         0 B         895 MiB
TOTAL       0 B         50 GiB      886 MiB     0 B         0 B         0 B
SIZE <<  0 B         48 GiB      8.6 TiB
On Mon, Aug 26, 2024 at 11:01 AM Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:

Hi Ruben,

it would be nice if you could share the output of 'ceph tell osd.N bluefs stats' for these OSDs.

Also, you might want to read the following thread:
https://www.spinics.net/lists/ceph-users/msg79062.html

It describes using 'ceph-volume lvm migrate' (or its counterpart in ceph-bluestore-tool; a rough sketch follows below) to migrate BlueFS data from the slow device to the DB volume.

The latter might have a temporary or a permanent effect though, depending on the spillover's root cause.
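
For reference, the ceph-bluestore-tool counterpart would look roughly like this (the osd id in the paths is a placeholder, and the OSD must be down):

  ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-NNN \
      --devs-source /var/lib/ceph/osd/ceph-NNN/block \
      --dev-target /var/lib/ceph/osd/ceph-NNN/block.db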


Thanks,

Igor

On 8/26/2024 10:08 AM, Ruben Bosch wrote:
Hi all,

ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)

We are working on marking out OSDs on a host with EC4+2. The OSDs are HDDs with a separate DB on an NVMe disk. All operations take ages. After some time we see BLUEFS_SPILLOVER. Telling the affected OSDs to compact sometimes helps (example below), but not always. The OSDs have plenty of space remaining in the DB, yet the spillover does not disappear.
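
For reference, a minimal example of what we run for compaction (osd.91 is one of the affected OSDs):

  ceph tell osd.91 compact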

[WRN] BLUEFS_SPILLOVER: 2 OSD(s) experiencing BlueFS spillover
     osd.91 spilled over 141 MiB metadata from 'db' device (15 GiB used of 50 GiB) to slow device
     osd.106 spilled over 70 MiB metadata from 'db' device (12 GiB used of 50 GiB) to slow device

Has anyone seen similar behavior before, and if so, found a workaround or a solution?

Kind regards,

Ruben Bosch
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



