Re: DB/WAL and RGW index on the same NVMe

Beyond the index question, I would suggest considering EC vs. replication for the data, and the latency implications. There is more than just the NVMe vs. rotational discussion to entertain, especially with the wider EC profiles such as 8+3. It is worth testing against your particular workload.
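
If you do want to compare, here is a rough sketch of setting up an 8+3 EC data pool (the profile name and PG counts are placeholders; the index pool is normally left replicated since it is omap-backed):

# ceph osd erasure-code-profile set rgw-ec-8-3 k=8 m=3 crush-failure-domain=host
# ceph osd pool create default.rgw.buckets.data 256 256 erasure rgw-ec-8-3
# ceph osd pool application enable default.rgw.buckets.data rgw

Running the same multipart workload against a replicated data pool as well should show whether the 8+3 write fan-out fits your latency targets.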

Also make sure to factor in storage utilization if you expect versioning/object lock to be in use. It can be a significant source of additional consumption that isn't planned for initially.
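
If versioning or object lock does end up enabled, it is worth watching per-bucket usage as versions accumulate, e.g. (bucket name is a placeholder):

# radosgw-admin bucket stats --bucket=backup-bucket

The usage counters there include noncurrent versions, so they can grow well past what the application reports as live data.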

On Mon, Apr 8, 2024, at 01:42, Daniel Parkes wrote:
> Hi Lukasz,
>
> RGW uses omap objects for the index pool. Omap data is stored in the RocksDB
> database of each OSD, not in the index pool itself, so by putting the DB/WAL
> on an NVMe as you mentioned, you are already placing the index data on a
> non-rotational device; you don't need to do anything else.
>
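> As a quick sanity check (osd.0 is a placeholder), something like
>
> # ceph osd metadata 0 | grep -E 'bluefs_db|rotational'
>
> should report bluefs_db_rotational as 0 once the DB/WAL sits on the NVMe.
>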
> You just need to size your DB/WAL partition accordingly. For RGW/object
> storage, a good starting point for the DB/WAL sizing is 4% of the OSD's
> capacity.
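>
> As a rough worked example (the drive size is hypothetical): with 12 TB HDDs,
> 4% is about 480 GB of DB/WAL per OSD, so a node with 12 such OSDs needs
> roughly 5.8 TB of NVMe capacity for DB/WAL alone.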
>
> Here is an example of omap entries in the index pool using 0 bytes, since
> they are stored in RocksDB:
>
> # rados -p default.rgw.buckets.index listomapkeys
> .dir.7fb0a3df-9553-4a76-938d-d23711e67677.34162.1.2
> file1
> file2
> file4
> file10
>
> # rados df -p default.rgw.buckets.index
> POOL_NAME                   USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS       RD  WR_OPS      WR  USED COMPR  UNDER COMPR
> default.rgw.buckets.index    0 B       11       0      33                   0        0         0     208  207 KiB      41  20 KiB         0 B          0 B
>
> # rados -p default.rgw.buckets.index stat
> .dir.7fb0a3df-9553-4a76-938d-d23711e67677.34162.1.2
> default.rgw.buckets.index/.dir.7fb0a3df-9553-4a76-938d-d23711e67677.34162.1.2
> mtime 2022-12-20T07:32:11.000000-0500, size 0
>
>
> On Sun, Apr 7, 2024 at 10:06 PM Lukasz Borek <lukasz@xxxxxxxxxxxx> wrote:
>
>> Hi!
>>
>> I'm working on a PoC cluster setup dedicated to a backup app writing objects
>> via S3 (large objects, up to 1 TB, transferred via multipart upload).
>>
>> The initial setup is 18 storage nodes (12 HDDs + 1 NVMe card for DB/WAL) plus
>> an EC pool. The plan is to use cephadm.
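>>
>> Roughly, the drive layout I have in mind would be expressed with a cephadm
>> OSD service spec along these lines (service_id and host_pattern are
>> placeholders):
>>
>> service_type: osd
>> service_id: hdd-osd-nvme-db
>> placement:
>>   host_pattern: '*'
>> spec:
>>   data_devices:
>>     rotational: 1
>>   db_devices:
>>     rotational: 0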
>>
>> I'd like to follow good practice and put the RGW index pool on a
>> non-rotational drive. The question is how to do it:
>>
>>    - replace a few HDDs (1 per node) with an SSD (how many? 4, 6, 8?)
>>    - reserve space on the NVMe drive in each node, create an LV-based OSD,
>>    and let the RGW index use the same NVMe drive as the DB/WAL
>>
>> Thoughts?
>>
>> --
>> Lukasz
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



