Re: SAS vs SATA for OSD - WAL+DB sizing.

Mark,

We are running a mix of RGW, RBD, and CephFS.  Our CephFS is pretty big,
but we're moving a lot of it to RGW.  What prompted me to go looking for a
guideline was a high frequency of spillover warnings as our cluster filled
up past the 50% mark.  That was with 14.2.9, I think.  I understand that
some things have changed since then, but I'd still like to have the
flexibility and performance of a generous WAL+DB - the cluster is used to
store research data, and the usage pattern is tending to change as the
research evolves.  No telling what our mix will be a year from now.
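
In case it's useful to anyone chasing the same warnings, here is a rough
sketch of the kind of check I mean: just a skim of ceph osd df -f json
output for OSDs whose omap + metadata footprint is creeping up on the DB
partition.  It assumes a Nautilus-or-later CLI that reports kb_used_omap
and kb_used_meta, and the 480 GB threshold below is just our current
partition size.

import json
import subprocess

DB_PARTITION_GB = 480  # our per-OSD WAL+DB partition size; adjust to taste

# kb_used_omap / kb_used_meta back the OMAP and META columns of `ceph osd df`
raw = subprocess.check_output(["ceph", "osd", "df", "-f", "json"])
nodes = json.loads(raw)["nodes"]

for osd in nodes:
    meta_gib = (osd.get("kb_used_omap", 0) + osd.get("kb_used_meta", 0)) / (1024 ** 2)
    pct = 100.0 * meta_gib / DB_PARTITION_GB
    note = "  <-- close to the DB partition" if pct > 80 else ""
    print(f"osd.{osd['id']:<4} metadata {meta_gib:8.1f} GiB  ({pct:5.1f}% of DB){note}")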

-Dave

--
Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx
607-760-2328 (Cell)
607-777-4641 (Office)


On Thu, Jun 3, 2021 at 7:39 PM Mark Nelson <mnelson@xxxxxxxxxx> wrote:

> FWIW, those guidelines try to be sort of a one-size-fits-all
> recommendation that may not apply to your situation.  Typically RBD has
> pretty low metadata overhead so you can get away with smaller DB
> partitions.  4% should easily be enough.  If you are running heavy RGW
> write workloads with small objects, you will almost certainly use more
> than 4% for metadata (I've seen worst case up to 50%, but that was
> before column family sharding which should help to some extent).  Having
> said that, bluestore will roll the higher rocksdb levels over to the
> slow device and keep the WAL, L0, and other lower LSM levels on the
> fast device.  It's not necessarily the end of the world if you end up
> with some of the more rarely used metadata on the HDD but having it on
> flash certainly is nice.
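
To put rough numbers on that for the drives we're looking at, here's some
back-of-the-envelope arithmetic in Python.  The 4% and worst-case ~50%
figures are the ones above; the fractions are just what-if values, not
Ceph settings.

# DB space needed at a given metadata fraction, for the drive sizes in this thread.
def db_gb(hdd_tb, metadata_fraction):
    return hdd_tb * 1000 * metadata_fraction  # decimal TB -> GB

for hdd_tb in (12, 14):
    for frac in (0.04, 0.10, 0.50):  # 4% guideline, 10%, worst-case ~50%
        print(f"{hdd_tb} TB HDD @ {frac:.0%} -> {db_gb(hdd_tb, frac):,.0f} GB of DB")

# For reference: the 500 GB NVMe per 12 TB HDD in our current nodes is ~4.2%.
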
>
>
> Mark
>
>
> On 6/3/21 5:18 PM, Dave Hall wrote:
> > Anthony,
> >
> > I had recently found a reference in the Ceph docs that indicated something
> > like 40GB per TB for WAL+DB space.  For a 12TB HDD that comes out to
> > 480GB.  If this is no longer the guideline I'd be glad to save a couple
> > dollars.
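
(Sanity check on that arithmetic: 40 GB per TB is the same thing as the 4%
guideline, since 0.04 x 12 TB = 480 GB, so the two figures agree for a
12 TB HDD.)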
> >
> > -Dave
> >
> > --
> > Dave Hall
> > Binghamton University
> > kdhall@xxxxxxxxxxxxxx
> >
> > On Thu, Jun 3, 2021 at 6:10 PM Anthony D'Atri <anthony.datri@xxxxxxxxx>
> > wrote:
> >
> >> Agreed.  I think oh …. maybe 15-20 years ago there was often a wider
> >> difference between SAS and SATA drives, but with modern queuing etc. my
> >> sense is that there is less of an advantage.  Seek and rotational latency,
> >> I suspect, dwarf interface differences wrt performance.  The HBA may be a
> >> bigger bottleneck (and way more trouble).
> >>
> >> 500 GB NVMe seems like a lot per HDD; are you using that as WAL+DB with
> >> RGW, or as dmcache or something?
> >>
> >> Depending on your constraints, QLC flash might be more competitive than
> >> you think ;)
> >>
> >> — aad
> >>
> >>
> >>> I suspect the behavior of the controller and the behavior of the drive
> >>> firmware will end up mattering more than SAS vs SATA.  As always it's best
> >>> if you can test it first before committing to buying a pile of them.
> >>> Historically I have seen SATA drives that have performed well as far as
> >>> HDDs go though.
> >>>
> >>> Mark
> >>>
> >>> On 6/3/21 4:25 PM, Dave Hall wrote:
> >>>> Hello,
> >>>>
> >>>> We're planning another batch of OSD nodes for our cluster.  Our prior
> >> nodes
> >>>> have been 8 x 12TB SAS drives plus 500GB NVMe per HDD.  Due to market
> >>>> circumstances and the shortage of drives, those 12TB SAS drives are in
> >>>> short supply.
> >>>>
> >>>> Our integrator has offered an option of 8 x 14TB SATA drives (still
> >>>> Enterprise).  For Ceph, will the switch to SATA carry a performance
> >>>> difference that I should be concerned about?
> >>>>
> >>>> Thanks.
> >>>>
> >>>> -Dave
> >>>>
> >>>> --
> >>>> Dave Hall
> >>>> Binghamton University
> >>>> kdhall@xxxxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



