Re: Add OSD with primary on HDD, WAL and DB on SSD

In my deployment I partition the disk for the WAL and DB separately, so that
I can assign the sizes manually.
For example, I create two NVMe partitions per OSD, 30G for its DB and 2G for
its WAL.
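
A minimal capacity-planning sketch of that scheme in Python (the NVMe size is
just an assumed example, not my actual device):

    # How many OSDs one NVMe can back when each OSD gets fixed DB/WAL partitions.
    NVME_GIB = 960          # assumed NVMe capacity, for illustration only
    DB_PER_OSD_GIB = 30     # DB partition per OSD
    WAL_PER_OSD_GIB = 2     # WAL partition per OSD

    per_osd = DB_PER_OSD_GIB + WAL_PER_OSD_GIB
    print(f"{per_osd} GiB per OSD -> up to {NVME_GIB // per_osd} OSDs per NVMe")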

Tony Liu <tonyliu0592@xxxxxxxxxxx> wrote on Friday, August 28, 2020 at 1:53 AM:

> How does the WAL utilize the disk when it shares the same device with the DB?
> Say the device size is 50G, 100G or 200G; it makes no difference to the DB
> because the DB will take 30G anyway. Does it make any difference
> to the WAL?
>
> Thanks!
> Tony
> > -----Original Message-----
> > From: Zhenshi Zhou <deaderzzs@xxxxxxxxx>
> > Sent: Wednesday, August 26, 2020 11:16 PM
> > To: Tony Liu <tonyliu0592@xxxxxxxxxxx>
> > Cc: Anthony D'Atri <anthony.datri@xxxxxxxxx>; ceph-users@xxxxxxx
> > Subject: Re:  Re: Add OSD with primary on HDD, WAL and DB on
> > SSD
> >
> > The official documentation says that you should allocate 4% of the slow
> > device's space for block.db.
> >
> > But the main problem is that BlueStore uses RocksDB, and RocksDB puts a
> > file on the fast device only if it thinks that the whole level will fit
> > there.
> >
> > For RocksDB, L1 is about 300M, L2 is about 3G, L3 is about 30G, and L4
> > is about 300G.
> > For instance, RocksDB puts L2 files on block.db only if there is at least
> > 3G available there.
> > As a result, 30G is an acceptable value.
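> >
> > A minimal sketch of that fit-the-whole-level rule, using the rough level
> > sizes above (real RocksDB/BlueStore behaviour is more nuanced, so treat
> > this as an illustration only):
> >
> >     # Approximate RocksDB level sizes in GiB, as quoted in this thread.
> >     LEVEL_SIZES_GIB = [0.3, 3, 30, 300]   # ~L1, L2, L3, L4
> >
> >     def usable_db_space(block_db_gib):
> >         """Largest level that fits entirely on block.db, per the rule above."""
> >         usable = 0.0
> >         for level in LEVEL_SIZES_GIB:
> >             if level <= block_db_gib:
> >                 usable = level          # this whole level still fits
> >             else:
> >                 break
> >         return usable
> >
> >     for size in (20, 30, 64, 300):
> >         print(f"{size:>3}G block.db -> ~{usable_db_space(size):g}G effectively used")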
> >
> > Tony Liu <tonyliu0592@xxxxxxxxxxx> wrote on Tuesday, August 25, 2020 at 10:49 AM:
> >
> >
> >       > -----Original Message-----
> >       > From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
> >       > Sent: Monday, August 24, 2020 7:30 PM
> >       > To: Tony Liu <tonyliu0592@xxxxxxxxxxx>
> >       > Subject: Re: Re: Add OSD with primary on HDD, WAL and DB on SSD
> >       >
> >       > Why such small HDDs? Kinda not worth the drive bays and power;
> >       > instead of the complexity of putting WAL+DB on a shared SSD, might you
> >       > have been able to just buy SSDs and not split? YMMV.
> >
> >       The 2TB is for testing; it will be bumped up to 10TB for production.
> >
> >       > The limit is a function of the way the DB levels work; it's not
> >       > intentional.
> >       >
> >       > The WAL by default takes a fixed size, something like 512 MB.
> >       >
> >       > 64 GB is a reasonable size; it accommodates the WAL and allows space
> >       > for DB compaction without overflowing.
> >
> >       For each 10TB HDD, what's the recommended size for the device holding
> >       both the DB and WAL? The doc recommends 1% - 4%, meaning 100GB - 400GB
> >       for each 10TB HDD. But given the WAL data size and DB data size, I am
> >       not sure that 100GB - 400GB would be used efficiently.
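> >
> >       A rough back-of-the-envelope comparison, using only figures quoted in
> >       this thread (so an illustration, not a recommendation):
> >
> >           # 10 TB HDD: 1%-4% block.db guidance vs. what is likely used
> >           hdd_gb = 10 * 1000                                # 10 TB in GB
> >           guidance_gb = [hdd_gb * p for p in (0.01, 0.04)]  # 100-400 GB
> >           db_used_gb = 30      # largest level below 300G that fits (see above)
> >           wal_gb = 1           # WAL is small and fixed (~0.5-1 GB)
> >           print(f"guidance: {guidance_gb[0]:.0f}-{guidance_gb[1]:.0f} GB per HDD")
> >           print(f"likely used: ~{db_used_gb + wal_gb} GB plus compaction headroom")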
> >
> >       > With this commit the situation should be improved, though you don't
> >       > mention what release you're running:
> >       >
> >       > https://github.com/ceph/ceph/pull/29687
> >
> >       I am using ceph version 15.2.4 octopus (stable).
> >
> >       Thanks!
> >       Tony
> >
> >       > >>> I don't need to create a WAL device, just the primary on HDD and
> >       > >>> the DB on SSD, and the WAL will use the DB device because it's
> >       > >>> faster. Is that correct?
> >       > >>
> >       > >> Yes.
> >       > >>
> >       > >>
> >       > >> But be aware that the DB sizes are limited to 3GB, 30GB and 300GB.
> >       > >> Anything less than those sizes will have a lot of unutilised space,
> >       > >> e.g. a 20GB device will only utilise 3GB.
> >       > >
> >       > > I have one 480GB SSD and seven 2TB HDDs. Seven LVs are created on
> >       > > the SSD, each about 64GB, one for each of the 7 OSDs.
> >       > >
> >       > > Since each LV is shared by the DB and WAL, the DB will take 30GB and
> >       > > the WAL will take the remaining 34GB. Is that correct?
> >       > >
> >       > > Is that DB and WAL size good for a 2TB HDD (for both the block store
> >       > > and object store cases)?
> >       > >
> >       > > Could you share a bit more about the intention behind such a limit?
> >       > >
> >       > >
> >       > > Thanks!
> >       > > Tony
> >       > > _______________________________________________
> >       > > ceph-users mailing list -- ceph-users@xxxxxxx
> >       > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
> >       _______________________________________________
> >       ceph-users mailing list -- ceph-users@xxxxxxx
> >       To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



