Re: Best practice and expected benefits of using separate WAL and DB devices with Bluestore


 



Hello,

I’m mainly going to answer the practical questions Niklaus had.

Our standard setup is 12 HDDs and 2 enterprise NVMe drives per node, which means 6 OSDs per NVMe. For partitioning we use LVM.
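
In case concrete commands help, this is roughly how the LVM side looks (device names and the LV size are placeholders, not our exact layout):

vgcreate ceph-db-nvme0 /dev/nvme0n1
lvcreate -L 1200G -n db-0 ceph-db-nvme0        # repeat for db-1 .. db-5
ceph-volume lvm create --bluestore --data /dev/sda --block.db ceph-db-nvme0/db-0

With cephadm you would normally let an OSD service spec do this for you, like the one Torkil posted below.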

The fact that one failed NVMe takes down 6 OSDs isn’t great, but our OSD node count is more than double the k+m of our erasure coding profile, so losing 6 OSDs should be OK-ish. Failing multiple NVMes at once could be an issue. If you use replicated pools this is less of a problem.
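
As a quick sanity check you can pull k and m from your profile, e.g.:

ceph osd erasure-code-profile get <your-profile>

With k=4, m=2 that is 6 chunks, so by the rule above you would want more than 12 OSD hosts before feeling relaxed about losing the 6 OSDs behind one NVMe.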

When it comes to recovery, Ceph handles it easily: just recreate the LVs and OSDs and you are good to go.
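
With cephadm the replacement is roughly (a sketch only; the OSD id, host and device names are made up):

ceph orch osd rm 10 --replace                      # repeat for each OSD on the dead NVMe
ceph orch device zap <host> /dev/nvme0n1 --force   # after the drive has been swapped

Then recreate the LVs (or let your OSD service spec do it) and the new OSDs backfill on their own.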

Another benefit for us is that because we use large NVMes (7.7 TiB) we can use the spare space for a fast pool.
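
The spare space simply becomes a few extra NVMe-class OSDs, and a CRUSH rule pinned to that device class gives you the fast pool, something like (names are placeholders):

ceph osd crush rule create-replicated fast-nvme default host nvme
ceph osd pool create fastpool 64 64 replicated fast-nvme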

Ondrej

> On 19. 4. 2024, at 12:04, Torkil Svensgaard <torkil@xxxxxxxx> wrote:
> 
> Hi
> 
> Red Hat Ceph support told us back in the day that 16 DB/WAL partitions per NVMe were the max supported by RHCS because their testing showed performance suffered beyond that. We are running with 11 per NVMe.
> 
> We are prepared to lose a bunch of OSDs if we have an NVMe die. We expect ceph will handle it and we can redeploy the OSDs with a new NVMe device.
> 
> We use a service spec for the chopping up bit:
> 
> service_type: osd
> service_id: slow
> service_name: osd.slow
> placement:
>  host_pattern: '*'
> spec:
>  block_db_size: 290966113186
>  data_devices:
>    rotational: 1
>  db_devices:
>    rotational: 0
>    size: '1000G:'
>  filter_logic: AND
>  objectstore: bluestore
> 
> Best regards,
> 
> Torkil
> 
> On 19-04-2024 11:02, Niklaus Hofer wrote:
>> Dear all
>> We have an HDD ceph cluster that could do with some more IOPS. One solution we are considering is installing NVMe SSDs into the storage nodes and using them as WAL- and/or DB devices for the Bluestore OSDs.
>> However, we have some questions about this and are looking for some guidance and advice.
>> The first one is about the expected benefits. Before we undergo the effort involved in the transition, we are wondering if it is even worth it. How much of a performance boost can one expect when adding NVMe SSDs as WAL devices to an HDD cluster? Plus, how much faster than that does it get with the DB also being on SSD? Are there rule-of-thumb numbers for that? Or maybe someone has done benchmarks in the past?
>> The second question is of a more practical nature. Are there any best practices on how to implement this? I was thinking we won't do one SSD per HDD - surely an NVMe SSD is plenty fast to handle the traffic from multiple OSDs. But what is a good ratio? Do I have one NVMe SSD per 4 HDDs? Per 6 or even 8? Also, how should I chop up the SSD, using partitions or using LVM? Last but not least, if I have one SSD handle WAL and DB for multiple OSDs, losing that SSD means losing multiple OSDs. How do people deal with this risk? Is it generally deemed acceptable or is this something people tend to mitigate, and if so, how? Do I run multiple SSDs in RAID?
>> I do realize that for some of these, there might not be the one perfect answer that fits all use cases. I am looking for best practices and in general just trying to avoid any obvious mistakes.
>> Any advice is much appreciated.
>> Sincerely
>> Niklaus Hofer
> 
> -- 
> Torkil Svensgaard
> Systems Administrator
> Danish Research Centre for Magnetic Resonance DRCMR, Section 714
> Copenhagen University Hospital Amager and Hvidovre
> Kettegaard Allé 30, 2650 Hvidovre, Denmark
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



