Cody <codeology.lab@xxxxxxxxx>:
>
> > And this exact problem was one of the reasons why we migrated
> > everything to PXE boot where the OS runs from RAM.
>
> Hi Paul,
>
> I totally agree with and admire your diskless approach. If I may ask,
> what kind of OS image do you use? A 1 GB footprint sounds really small.

It's based on Debian, because Debian makes live boot really easy with
squashfs + overlayfs. We also have a half-finished CentOS/RHEL-based
version somewhere, but that requires way more RAM because it doesn't use
overlayfs (or didn't when we last checked; I guess we need to check
RHEL 8 again).

The current image size is 400 MB, plus 30 MB for kernel + initrd, and it
comes with everything you need for Ceph. We don't even run aggressive
compression on the squashfs; it's just lzo.

You can test it for yourself in a VM: https://croit.io/croit-virtual-demo

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
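For reference, building a comparable image with stock Debian tooling could
look roughly like the sketch below. The package selection, suite, and paths
are illustrative assumptions, not croit's actual build:

  # minimal Debian rootfs with live-boot and the Ceph daemons
  debootstrap --variant=minbase \
      --include=linux-image-amd64,live-boot,systemd-sysv,ceph-osd,ceph-mon,ceph-mgr \
      buster rootfs http://deb.debian.org/debian

  # pack the rootfs into a squashfs with light lzo compression
  mksquashfs rootfs filesystem.squashfs -comp lzo -noappend

  # kernel and initrd are served separately over PXE; live-boot's initramfs
  # fetches the squashfs over the network and sets up the overlayfs in RAM
  cp rootfs/boot/vmlinuz-* rootfs/boot/initrd.img-* /srv/tftp/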
>
> On Tue, Nov 27, 2018 at 1:53 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
> >
> > And this exact problem was one of the reasons why we migrated
> > everything to PXE boot where the OS runs from RAM.
> > That kind of failure is just the worst to debug...
> > Also, 1 GB of RAM is cheaper than a separate OS disk.
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io
> > Tel: +49 89 1896585 90
> >
> > On Tue, Nov 27, 2018 at 7:22 PM, Cody <codeology.lab@xxxxxxxxx> wrote:
> > >
> > > Hi everyone,
> > >
> > > Many, many thanks to all of you!
> > >
> > > The root cause was a failed OS drive on one storage node. The server
> > > was responsive to ping, but I was unable to log in. After a reboot via
> > > IPMI, the docker daemon failed to start due to I/O errors and dmesg
> > > complained about the failing OS disk. I failed to catch the problem
> > > initially since 'ceph -s' kept showing HEALTH and the cluster was
> > > "functional" despite the slow performance.
> > >
> > > I really appreciate all the tips and advice I received from you all and
> > > learned a lot. I will carry your advice (e.g. using bluestore,
> > > enterprise SSDs/HDDs, separating public and cluster traffic, etc.) into
> > > my next round of PoC.
> > >
> > > Thank you very much!
> > >
> > > Best regards,
> > > Cody
> > >
> > > On Tue, Nov 27, 2018 at 6:31 AM Vitaliy Filippov <vitalif@xxxxxxxxxx> wrote:
> > > >
> > > > > CPU: 2 x E5-2603 @1.8GHz
> > > > > RAM: 16GB
> > > > > Network: 1G port shared for Ceph public and cluster traffic
> > > > > Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> > > > > OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
> > > >
> > > > 0.84 MB/s sequential write is impossibly bad; it's not normal with any
> > > > kind of devices, even with a 1G network. You probably have some kind of
> > > > problem in your setup - maybe the network RTT is very high, or the OSD
> > > > or mon nodes are shared with other running tasks and overloaded, or
> > > > maybe your disks are already dead... :))
> > > >
> > > > > As I moved on to test block devices, I got the following error message:
> > > > >
> > > > > # rbd map image01 --pool testbench --name client.admin
> > > >
> > > > You don't need to map it to run benchmarks, use `fio --ioengine=rbd`
> > > > (however, you'll still need /etc/ceph/ceph.client.admin.keyring).
> > > >
> > > > --
> > > > With best regards,
> > > >   Vitaliy Filippov
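For reference, the `fio --ioengine=rbd` run suggested above could look
roughly like this; the pool and image names come from the thread, while the
block size, queue depth, and runtime are arbitrary example values:

  # sequential-write benchmark straight against the RBD image, no rbd map needed
  fio --name=seqwrite --ioengine=rbd --clientname=admin \
      --pool=testbench --rbdname=image01 \
      --rw=write --bs=4M --iodepth=16 --runtime=60 --time_based

If your fio build includes rbd support, `fio --enghelp=rbd` lists the
engine-specific options.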