Re: bluefs _allocate unable to allocate

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jose,

In fact 48GB is a way too much for WAL drive - usually the write ahead log tend to be 2-4 GBs.

But in your case it's ~150GB, while DB itself is very small (146MB!!!):

WAL         45 GiB      111 GiB     0 B         0 B         0 B       154 GiB     2400

DB          0 B         164 MiB     0 B         0 B         0 B       146 MiB     30


which means that there are some issues with RocksDB's WAL processing, which needs some troubleshooting...

Curious if other OSDs are suffering from the same and whether you have any custom settings for your OSD(s)?

Additionally you might want to try the following command to compact this specific OSD manually and check if this would normalize the DB layout - the majority of data has to be at DB level not WAL. Please share the resulting layout (reported by "ceph daemon osd.2 bluefs stats" command) after the compaction is fulfiled and OSD is restarted.

The compaction command to be applied on an offline OSD: "ceph-kvstore-tool bluestore-kv <path-to-osd> compact"

Even if the above works great please refrain from applying that compaction to every OSD - let's see how that "compacted" OSD evolves.Would WAL grow again or not?

Thanks,

Igor






On 10/6/2021 1:35 PM, José H. Freidhof wrote:
Hello Igor,

yes the volume is  nvme wal partitions for the bluestore devicegroups are only 48gb each

on each osd node are 1 nvme with 1tb splitted in 20 lvs with 48gb (WAL)
on each osd node are 4 ssd with 1tb splitted in 5 lvs with 175gb  (rock.db)
on each osd node are 20 hdd with 5.5tb with 1 lvs (block.db)

each blustore have 1 partition nvme,ssd and hdd like described in the documentation https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/ <https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/>

is this to small or can i adjust the max allocation on the wal nvme device in the ceph configuration? i know that the ssd and nvme are to small for those 5.5tb disk... its 1% only ot the rotation disk. i am new in ceph and still or always learning, but we are in a little hurry because our other datastores are old and full.

root@cd88-ceph-osdh-01:/# ceph daemon osd.2 bluestore bluefs device info
{
    "dev": {
        "device": "BDEV_WAL",
        "total": 48318377984,
        "free": 1044480,
        "bluefs_used": 48317333504
    },
    "dev": {
        "device": "BDEV_DB",
        "total": 187904811008,
        "free": 68757217280,
        "bluefs_used": 119147593728
    },
    "dev": {
        "device": "BDEV_SLOW",
        "total": 6001172414464,
        "free": 5624912359424,
        "bluefs_used": 0,
        "bluefs max available": 5624401231872
    }
}
root@cd88-ceph-osdh-01:/# ceph daemon osd.2 bluefs stats
0 : device size 0xb3ffff000 : using 0xb3ff00000(45 GiB)
1 : device size 0x2bbfffe000 : using 0x1bbeb00000(111 GiB)
2 : device size 0x57541c00000 : using 0x579b592000(350 GiB)
RocksDBBlueFSVolumeSelector: wal_total:45902462976, db_total:178509578240, slow_total:5701113793740, db_avail:103884521472
Usage matrix:
DEV/LEV     WAL         DB          SLOW        *           *         REAL        FILES LOG         124 MiB     2.3 GiB     0 B         0 B         0 B         7.5 MiB     1 WAL         45 GiB      111 GiB     0 B         0 B         0 B         154 GiB     2400 DB          0 B         164 MiB     0 B         0 B         0 B         146 MiB     30 SLOW        0 B         0 B         0 B         0 B         0 B         0 B         0 TOTALS      45 GiB      113 GiB     0 B         0 B         0 B         0 B         2431
MAXIMUMS:
LOG         124 MiB     2.3 GiB     0 B         0 B         0 B         17 MiB WAL         45 GiB      149 GiB     0 B         0 B         0 B         192 GiB DB          0 B         762 MiB     0 B         0 B         0 B         741 MiB SLOW        0 B         0 B         0 B         0 B         0 B         0 B TOTALS      45 GiB      150 GiB     0 B         0 B         0 B         0 B

Am Mi., 6. Okt. 2021 um 11:45 Uhr schrieb Igor Fedotov <ifedotov@xxxxxxx <mailto:ifedotov@xxxxxxx>>:

    Hey Jose,

    it looks like your WAL volume is out of space which looks weird given
    its capacity = 48Gb.

    Could you please share the output of the following commands:

    ceph daemon osd.N bluestore bluefs device info

    ceph daemon osd.N bluefs stats


    Thanks,

    Igor


    On 10/6/2021 12:24 PM, José H. Freidhof wrote:
    > Hello together
    >
    > we have a running ceph pacific 16.2.5 cluster and i found this
    messages in
    > the service logs of the osd daemons.
    >
    > we have three osd nodes .. each node has 20osds as bluestore with
    > nvme/ssd/hdd
    >
    > is this a bug or maybe i have some settings wrong?
    >
    >
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:25.821+0000
    > 7f38eebd4700  1 bluefs _allocate unable to allocate 0x100000 on
    bdev 0,
    > allocator name bluefs-wal, allocator type hybrid, capacity
    0xb40000000,
    > block size 0x100000, free 0xff000, fragmentation 0, allocated 0x0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:29.857+0000
    > 7f38eebd4700  1 bluefs _allocate unable to allocate 0x100000 on
    bdev 0,
    > allocator name bluefs-wal, allocator type hybrid, capacity
    0xb40000000,
    > block size 0x100000, free 0xff000, fragmentation 0, allocated 0x0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.073+0000
    > 7f38eebd4700  1 bluefs _allocate unable to allocate 0x400000 on
    bdev 0,
    > allocator name bluefs-wal, allocator type hybrid, capacity
    0xb40000000,
    > block size 0x100000, free 0xff000, fragmentation 0, allocated 0x0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.405+0000
    > 7f38eebd4700  1 bluefs _allocate unable to allocate 0x100000 on
    bdev 0,
    > allocator name bluefs-wal, allocator type hybrid, capacity
    0xb40000000,
    > block size 0x100000, free 0xff000, fragmentation 0, allocated 0x0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.465+0000
    > 7f38eebd4700  1 bluefs _allocate unable to allocate 0x100000 on
    bdev 0,
    > allocator name bluefs-wal, allocator type hybrid, capacity
    0xb40000000,
    > block size 0x100000, free 0xff000, fragmentation 0, allocated 0x0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.529+0000
    > 7f38eebd4700  1 bluefs _allocate unable to allocate 0x100000 on
    bdev 0,
    > allocator name bluefs-wal, allocator type hybrid, capacity
    0xb40000000,
    > block size 0x100000, free 0xff000, fragmentation 0, allocated 0x0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.545+0000
    > 7f38eebd4700  4 rocksdb: [db_impl/db_impl_write.cc:1668] [L] New
    memtable
    > created with log file: #9588. Immutable memtables: 1.
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.545+0000
    > 7f38eebd4700  1 bluefs _allocate unable to allocate 0x100000 on
    bdev 0,
    > allocator name bluefs-wal, allocator type hybrid, capacity
    0xb40000000,
    > block size 0x100000, free 0xff000, fragmentation 0, allocated 0x0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.545+0000
    > 7f3905c02700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.547575)
    > [db_impl/db_impl_compaction_flush.cc:2198] Calling
    > FlushMemTableToOutputFile with column family [L], flush slots
    available 1,
    > compaction slots available 1, flush slots scheduled 1,
    compaction slots
    > scheduled 0
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.545+0000
    > 7f3905c02700  4 rocksdb: [flush_job.cc:321] [L] [JOB 5709] Flushing
    > memtable with next log file: 9587
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.545+0000
    > 7f3905c02700  4 rocksdb: [flush_job.cc:321] [L] [JOB 5709] Flushing
    > memtable with next log file: 9588
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.545+0000
    > 7f3905c02700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850547916,
    > "job": 5709, "event": "flush_started", "num_memtables": 2,
    "num_entries":
    > 4146, "num_deletes": 0, "total_data_size": 127203926,
    "memory_usage":
    > 130479920, "flush_reason": "Write Buffer Full"}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.545+0000
    > 7f3905c02700  4 rocksdb: [flush_job.cc:350] [L] [JOB 5709]
    Level-0 flush
    > table #9589: started
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f3905c02700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850559292,
    > "cf_name": "L", "job": 5709, "event": "table_file_creation",
    "file_number":
    > 9589, "file_size": 3249934, "table_properties": {"data_size":
    3247855,
    > "index_size": 1031, "index_partitions": 0,
    "top_level_index_size": 0,
    > "index_key_is_user_key": 0, "index_value_is_delta_encoded": 0,
    > "filter_size": 197, "raw_key_size": 1088,
    "raw_average_key_size": 16,
    > "raw_value_size": 3246252, "raw_average_value_size": 47739,
    > "num_data_blocks": 36, "num_entries": 68, "num_deletions": 32,
    > "num_merge_operands": 0, "num_range_deletions": 0,
    "format_version": 0,
    > "fixed_key_len": 0, "filter_policy": "rocksdb.BuiltinBloomFilter",
    > "column_family_name": "L", "column_family_id": 10, "comparator":
    > "leveldb.BytewiseComparator", "merge_operator": "nullptr",
    > "prefix_extractor_name": "nullptr", "property_collectors": "[]",
    > "compression": "NoCompression", "compression_options":
    "window_bits=-14;
    > level=32767; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0;
    > enabled=0; ", "creation_time": 1633511730, "oldest_key_time":
    1633511730,
    > "file_creation_time": 1633511850}}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f3905c02700  4 rocksdb: [flush_job.cc:401] [L] [JOB 5709]
    Level-0 flush
    > table #9589: 3249934 bytes OK
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f3905c02700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.559362)
    > [memtable_list.cc:447] [L] Level-0 commit table #9589 started
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f3905c02700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.559583)
    > [memtable_list.cc:503] [L] Level-0 commit table #9589: memtable
    #1 done
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f3905c02700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.559586)
    > [memtable_list.cc:503] [L] Level-0 commit table #9589: memtable
    #2 done
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f3905c02700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.559601)
    > EVENT_LOG_v1 {"time_micros": 1633511850559593, "job": 5709, "event":
    > "flush_finished", "output_compression": "NoCompression",
    "lsm_state": [8,
    > 1, 0, 0, 0, 0, 0], "immutable_memtables": 0}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f3905c02700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.559638)
    > [db_impl/db_impl_compaction_flush.cc:205] [L] Level summary:
    files[8 1 0 0
    > 0 0 0] max score 1.00
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f38fb3ed700  4 rocksdb: [compaction/compaction_job.cc:1676] [L]
    [JOB 5710]
    > Compacting 8@0 + 1@1 files to L1, score 1.00
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f38fb3ed700  4 rocksdb: [compaction/compaction_job.cc:1680] [L]
    Compaction
    > start summary: Base version 3090 Base level 0, inputs: [9589(3173KB)
    > 9586(4793KB) 9583(1876KB) 9580(194KB) 9576(6417KB) 9573(1078KB)
    9570(405KB)
    > 9567(29KB)], [9564(1115KB)]
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.557+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850559956,
    > "job": 5710, "event": "compaction_started", "compaction_reason":
    > "LevelL0FilesNum", "files_L0": [9589, 9586, 9583, 9580, 9576,
    9573, 9570,
    > 9567], "files_L1": [9564], "score": 1, "input_data_size": 19542092}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: [compaction/compaction_job.cc:1349] [L]
    [JOB 5710]
    > Generated table #9590: 36 keys, 3249524 bytes
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850582987,
    > "cf_name": "L", "job": 5710, "event": "table_file_creation",
    "file_number":
    > 9590, "file_size": 3249524, "table_properties": {"data_size":
    3247449,
    > "index_size": 1031, "index_partitions": 0,
    "top_level_index_size": 0,
    > "index_key_is_user_key": 0, "index_value_is_delta_encoded": 0,
    > "filter_size": 197, "raw_key_size": 576, "raw_average_key_size": 16,
    > "raw_value_size": 3246252, "raw_average_value_size": 90173,
    > "num_data_blocks": 36, "num_entries": 36, "num_deletions": 0,
    > "num_merge_operands": 0, "num_range_deletions": 0,
    "format_version": 0,
    > "fixed_key_len": 0, "filter_policy": "rocksdb.BuiltinBloomFilter",
    > "column_family_name": "L", "column_family_id": 10, "comparator":
    > "leveldb.BytewiseComparator", "merge_operator": "nullptr",
    > "prefix_extractor_name": "nullptr", "property_collectors": "[]",
    > "compression": "NoCompression", "compression_options":
    "window_bits=-14;
    > level=32767; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0;
    > enabled=0; ", "creation_time": 1633471854, "oldest_key_time": 0,
    > "file_creation_time": 1633511850}}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: [compaction/compaction_job.cc:1415] [L]
    [JOB 5710]
    > Compacted 8@0 + 1@1 files to L1 => 3249524 bytes
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.583469)
    > [compaction/compaction_job.cc:760] [L] compacted to: files[0 1 0
    0 0 0 0]
    > max score 0.01, MB/sec: 846.1 rd, 140.7 wr, level 1, files in(8,
    1) out(1)
    > MB in(17.5, 1.1) out(3.1), read-write-amplify(1.2)
    write-amplify(0.2) OK,
    > records in: 376, records dropped: 340 output_compression:
    NoCompression
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: (Original Log Time
    2021/10/06-09:17:30.583498)
    > EVENT_LOG_v1 {"time_micros": 1633511850583485, "job": 5710, "event":
    > "compaction_finished", "compaction_time_micros": 23098,
    > "compaction_time_cpu_micros": 20039, "output_level": 1,
    "num_output_files":
    > 1, "total_output_size": 3249524, "num_input_records": 376,
    > "num_output_records": 36, "num_subcompactions": 1,
    "output_compression":
    > "NoCompression", "num_single_delete_mismatches": 0,
    > "num_single_delete_fallthrough": 0, "lsm_state": [0, 1, 0, 0, 0,
    0, 0]}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583615,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9589}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583648,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9586}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583675,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9583}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583709,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9580}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583739,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9576}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583769,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9573}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583804,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9570}
    > cd88-ceph-osdh-01 bash[6283]: debug 2021-10-06T09:17:30.581+0000
    > 7f38fb3ed700  4 rocksdb: EVENT_LOG_v1 {"time_micros":
    1633511850583835,
    > "job": 5710, "event": "table_file_deletion", "file_number": 9567}
    > _______________________________________________
    > ceph-users mailing list -- ceph-users@xxxxxxx
    <mailto:ceph-users@xxxxxxx>
    > To unsubscribe send an email to ceph-users-leave@xxxxxxx
    <mailto:ceph-users-leave@xxxxxxx>



--

Mit freundlichen Grüßen,

 -

José H. Freidhof

Reyerhütterstrasse 130b
41065 Mönchengladbach
eMail: harald.freidhof@xxxxxxxxx <mailto:harald.freidhof@xxxxxxxxx>
mobil: +49 (0) 1523 – 717 7801

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Ceph Dev]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux