I've just updated the bug notes.
Most probably the issue is caused by an already fixed bug in RocksDB,
see
https://github.com/facebook/rocksdb/commit/65a9cd616876c7a1204e1a50990400e4e1f61d7e
Hence the question is if we plan to backport the fix and how to arrange
that.
Thanks,
Igor
On 11/28/2017 6:44 PM, Igor Fedotov wrote:
here it is
http://tracker.ceph.com/issues/22264
On 11/28/2017 6:37 PM, Mark Nelson wrote:
Looks like a bug guys! :) Mind making a ticket in the tracker?
Mark
On 11/28/2017 07:39 AM, Igor Fedotov wrote:
Looks like I can easily reproduce that (note slow_used_bytes):
    "bluefs": {
        "gift_bytes": 105906176,
        "reclaim_bytes": 0,
        "db_total_bytes": 4294959104,
        "db_used_bytes": 76546048,
        "wal_total_bytes": 1073737728,
        "wal_used_bytes": 239075328,
        "slow_total_bytes": 1179648000,
        "slow_used_bytes": 63963136,
        "num_files": 13,
        "log_bytes": 2539520,
        "log_compactions": 3,
        "logged_bytes": 255176704,
        "files_written_wal": 3,
        "files_written_sst": 10,
        "bytes_written_wal": 1932165189,
        "bytes_written_sst": 340957748
    },
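For anyone who wants to check their own OSDs for the same symptom, a minimal sketch follows. It is illustrative only: the helper name spilled_while_db_free and the 50% threshold are assumptions made for this example, and the counter values are simply copied from the dump above.

```python
import json

# Subset of the "bluefs" perf counters quoted above.
PERF_DUMP = """
{
  "bluefs": {
    "db_total_bytes": 4294959104,
    "db_used_bytes": 76546048,
    "slow_total_bytes": 1179648000,
    "slow_used_bytes": 63963136
  }
}
"""

def spilled_while_db_free(bluefs, max_db_fill=0.5):
    """Return True if BlueFS holds data on the slow device although the
    db device is less than max_db_fill full (hypothetical threshold)."""
    db_fill = bluefs["db_used_bytes"] / bluefs["db_total_bytes"]
    return bluefs["slow_used_bytes"] > 0 and db_fill < max_db_fill

bluefs = json.loads(PERF_DUMP)["bluefs"]
db_fill = bluefs["db_used_bytes"] / bluefs["db_total_bytes"]
print(f"db fill: {db_fill:.1%}, slow used: {bluefs['slow_used_bytes']} bytes")
print("spillover with db mostly free:", spilled_while_db_free(bluefs))
```

With the numbers above the db device is under 2% full while slow_used_bytes is non-zero, which is the condition being reported.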
On 11/28/2017 4:17 PM, Sage Weil wrote:
Hi Shasha,
On Tue, 28 Nov 2017, shasha lu wrote:
Hi, Mark
We test bluestore with 12.2.1.
There are two hosts in our rgw cluster, and each host contains 2 osds. The
rgw pool size is 2. We use a 5GB partition for db.wal and a 50GB SSD
partition for block.db.
# ceph --admin-daemon ceph-osd.1.asok config get rocksdb_db_paths
{
"rocksdb_db_paths": "db,51002736640 db.slow,284999998054"
}
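As a reading aid, the value above is a space-separated list of "path,size-in-bytes" pairs. A small sketch that splits it and prints the sizes in GiB (the helper name parse_db_paths is made up for this example, and the format is assumed from the output shown here only):

```python
def parse_db_paths(value):
    """Split a 'path,size path,size' string into (path, size_in_bytes) pairs."""
    paths = []
    for entry in value.split():
        # Split on the last comma so the size is always the final field.
        path, size = entry.rsplit(",", 1)
        paths.append((path, int(size)))
    return paths

paths = parse_db_paths("db,51002736640 db.slow,284999998054")
for path, size in paths:
    print(f"{path}: {size / 2**30:.1f} GiB")
```

The db entry works out to 47.5 GiB, so the size limit configured for db is nowhere near the ~3GB actually in use.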
After writing about 4 million (400W) 4k rgw objects, we used
ceph-bluestore-tool to export the rocksdb files.
# ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/osd1 --out-dir /tmp/osd1
# cd /tmp/osd1
# ls
db db.slow db.wal
# du -sh *
2.8G db
809M db.slow
439M db.wal
The block.db partition has 50GB of space, but it only contains ~3GB of
files, and the metadata is already rolling over onto db.slow.
It seems that only L0-L2 files are located in block.db (L0 256M; L1 256M;
L2 2.5GB); L3 and higher-level files are located in db.slow.
According to the ceph docs, metadata rolls over onto db.slow only when
block.db fills up. But in our env the block.db partition is far from
full.
Did I make any mistakes? Are there any additional options that should be
set for rocksdb?
You didn't make any mistakes--this should happen automatically. It looks
like rocksdb isn't behaving as advertised. I've opened
http://tracker.ceph.com/issues/22264 to track this. We need to start by
reproducing the situation.
My guess is that rocksdb is deciding that all of L3 can't fit on db and
so it's putting all of L3 on db.slow?
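To put rough numbers on that guess, here is a sketch of a simple greedy level-to-path assignment. This is not RocksDB code: the 256 MiB base and the 10x multiplier are inferred from the level sizes reported above (L1 256M, L2 2.5GB), the path sizes are the rocksdb_db_paths values from the report, and level_targets and assign_levels are hypothetical helpers written for this example.

```python
MiB = 2**20

def level_targets(base, multiplier, num_levels):
    """Assumed target sizes: L0 and L1 are about `base`, each later level is
    `multiplier` times larger than the previous one."""
    targets = {0: base}
    for n in range(1, num_levels + 1):
        targets[n] = base * multiplier ** (n - 1)
    return targets

def assign_levels(targets, paths):
    """Greedy placement: walk the levels in order and keep each one on the
    current path while its full target size still fits; otherwise move on to
    the next path and never go back."""
    placement = {}
    idx = 0
    remaining = paths[0][1]
    for level in sorted(targets):
        size = targets[level]
        while size > remaining and idx < len(paths) - 1:
            idx += 1
            remaining = paths[idx][1]
        placement[level] = paths[idx][0]
        remaining -= size
    return placement

paths = [("db", 51002736640), ("db.slow", 284999998054)]
targets = level_targets(256 * MiB, 10, 4)
placement = assign_levels(targets, paths)
for level in sorted(placement):
    print(f"L{level}: target {targets[level] / 2**30:.2f} GiB -> {placement[level]}")
```

Under these assumptions a full L3 (25 GiB) still fits next to L0-L2 in the 47.5 GiB db path, and only L4 would be pushed to db.slow. So a model like this does not explain L3 landing on db.slow, which is consistent with something going wrong in the actual path selection.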
sage
--
To unsubscribe from this list: send the line "unsubscribe
ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html