Re: Questions on Ceph cluster without OS disks

Hello Brent,

just use
https://pages.croit.io/croit/v2002/getting-started/installation.html
Our free community edition provides all the logic you need to run a
reliable PXE-booted Ceph system.

If you want to see it in action, please feel free to contact me and I will
give you a live presentation and answer all your questions.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.verges@xxxxxxxx
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Sun, Apr 5, 2020 at 8:13 PM Brent Kennedy <bkennedy@xxxxxxxxxx> wrote:

> I agree with the sentiment regarding swap; however, it seems the OS devs
> still suggest having swap, even if it's small.  We monitor swap usage and
> there is none in the Ceph clusters. I am mainly looking at eliminating it
> (assuming it's “safe” to do so), but I don't want to risk production
> machines just to save some OS space on disk.  However, the idea of loading
> the OS into memory is very interesting to me, at least for a production
> environment.  Not that it's a new thing, more so in the use case of Ceph
> clusters.  We already run all the command and control on VMs, so running
> the OSD host servers' OS in memory seems like a nifty idea to let us fully
> use every disk bay.  We have some older 620s that boot from a mirrored SD
> card (which is not super reliable in practice); they might be good
> candidates for this.  I am just wondering how we would drop in the correct
> Ceph configuration files during boot without needing to do tons of
> scripting (the clusters are 15-20 machines).
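>
> (Purely a sketch of the kind of scripting I have in mind, nothing we run
> today; the hostname, URL and paths below are made up. A small boot-time
> script could pull a node's Ceph config from a central management host,
> e.g.:)
>
>     # fetch_ceph_conf.py - hypothetical boot-time config fetch;
>     # the endpoint and destination paths are placeholders.
>     import pathlib
>     import urllib.request
>
>     CONFIG_URL = "http://mgmt.example.internal/ceph/{name}"  # made-up endpoint
>     FILES = {
>         "ceph.conf": "/etc/ceph/ceph.conf",
>         "ceph.client.admin.keyring": "/etc/ceph/ceph.client.admin.keyring",
>     }
>
>     for name, dest in FILES.items():
>         # download each file and write it to its destination
>         data = urllib.request.urlopen(CONFIG_URL.format(name=name), timeout=10).read()
>         path = pathlib.Path(dest)
>         path.parent.mkdir(parents=True, exist_ok=True)
>         path.write_bytes(data)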
>
>
>
> -Brent
>
>
>
> *From:* Martin Verges <martin.verges@xxxxxxxx>
> *Sent:* Sunday, April 5, 2020 3:04 AM
> *To:* Brent Kennedy <bkennedy@xxxxxxxxxx>
> *Cc:* huxiaoyu@xxxxxxxxxxxx; ceph-users <ceph-users@xxxxxxx>
> *Subject:* Re:  Re: Questions on Ceph cluster without OS disks
>
>
>
> Hello Brent,
>
>
>
> no, swap is definitely not needed if you configure your systems correctly.
>
> Swap on a Ceph node kills your performance and brings a lot of harm to
> clusters: it increases downtime, decreases performance, and can result in
> much longer recovery times, which endangers your data.
>
>
>
> In the old days, swap was required because you could not put enough memory
> into your systems. Today's servers do not need a swap partition, and I have
> personally disabled it on all my systems for the past >10 years. As my last
> company was a datacenter provider with several thousand systems, I believe
> I have some insight into whether that is stable.
>
>
>
> What happens if you run out of memory, you might ask? Simple: the OOM
> killer kills one process, systemd restarts it, and the service is back up
> in a few seconds.
>
> Can you choose which process is most likely to be killed? Yes, you can.
> Take a look at /proc/*/oom_adj (oom_score_adj on current kernels); see the
> sketch below.
>
> What happens if swap fills up? Total destruction ;). The OOM killer still
> kills one process, but freeing up swap takes much longer, system load
> skyrockets, services become unresponsive, and Ceph client IO can drop to
> near zero... just save yourself the trouble.
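>
> A minimal sketch of that knob (on current kernels the file is
> /proc/<pid>/oom_score_adj with a range of -1000 to 1000; the script and
> helper name below are just an example, not anything we ship):
>
>     # set_oom_score_adj.py - make a process a preferred (or protected) OOM victim
>     import sys
>
>     def set_oom_score_adj(pid: int, value: int) -> None:
>         # kernel accepts values from -1000 (never kill) to 1000 (kill first)
>         if not -1000 <= value <= 1000:
>             raise ValueError("oom_score_adj must be between -1000 and 1000")
>         with open(f"/proc/{pid}/oom_score_adj", "w") as f:
>             f.write(str(value))
>
>     if __name__ == "__main__":
>         # e.g. "python3 set_oom_score_adj.py 12345 500" makes pid 12345 a
>         # likely OOM victim; a negative value protects it instead.
>         set_oom_score_adj(int(sys.argv[1]), int(sys.argv[2]))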
>
>
>
> So yes, we strongly believe we have a far superior system by design,
> simply by avoiding swap entirely.
>
>
> --
>
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.verges@xxxxxxxx
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
>
>
>
> On Sun, Apr 5, 2020 at 1:59 AM Brent Kennedy <bkennedy@xxxxxxxxxx> wrote:
>
> Forgive me for asking, but it seems most OSes require a swap file, and
> when I look into doing something similar (meaning not having one at all),
> they all say the OS could become unstable without it.  It seems that
> anyone doing this needs to be 100% certain memory will never hit 100%
> usage, or the OS would crash without swap.  How are you getting around
> this, and has it ever been an issue?
>
> Also, for the Ceph OSDs, where are you storing the OSD and host
> configurations (central storage?)?
>
> Regards,
> -Brent
>
> Existing Clusters:
> Test: Nautilus 14.2.2 with 3 osd servers, 1 mon/mgr, 1 gateway, 2 iscsi
> gateways (all virtual on NVMe)
> US Production(HDD): Nautilus 14.2.2 with 11 osd servers, 3 mons, 4
> gateways, 2 iscsi gateways
> UK Production(HDD): Nautilus 14.2.2 with 12 osd servers, 3 mons, 4 gateways
> US Production(SSD): Nautilus 14.2.2 with 6 osd servers, 3 mons, 3
> gateways, 2 iscsi gateways
>
>
>
>
> -----Original Message-----
> From: Martin Verges <martin.verges@xxxxxxxx>
> Sent: Sunday, March 22, 2020 3:50 PM
> To: huxiaoyu@xxxxxxxxxxxx
> Cc: ceph-users <ceph-users@xxxxxxx>
> Subject:  Re: Questions on Ceph cluster without OS disks
>
> Hello Samuel,
>
> we at croit.io don't use NFS to boot up servers. We copy the OS directly
> into RAM (approximately 0.5-1 GB). Think of it like a container: you start
> it and throw it away when you no longer need it. This way we free up the
> OS disk slots to add more storage per node and reduce overall costs, as
> 1 GB of RAM is cheaper than an OS disk and consumes less power.
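>
> Conceptually, getting the OS into RAM is just an image download into
> memory-backed storage at early boot. The sketch below only illustrates
> that one step (the URL and paths are placeholders, and in a real PXE setup
> the initramfs typically does this work, not a script like this):
>
>     # fetch_image_to_ram.py - illustrative only; URL and target path are made up
>     import urllib.request
>
>     IMAGE_URL = "http://mgmt.example.internal/images/os-root.squashfs"
>     RAM_TARGET = "/dev/shm/os-root.squashfs"  # tmpfs-backed, i.e. held in RAM
>
>     # stream the root image over HTTP into RAM in 1 MiB chunks
>     with urllib.request.urlopen(IMAGE_URL, timeout=30) as resp, \
>             open(RAM_TARGET, "wb") as out:
>         while chunk := resp.read(1 << 20):
>             out.write(chunk)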
>
> If our management node is down, nothing will happen to the cluster. No
> impact, no downtime. However, you do need the mgmt node to boot up the
> cluster. So after a very rare total power outage, your first system would
> be the mgmt node and then the cluster itself. But again, if you configure
> your systems correctly, no manual work is required to recover from that. For
> everything else, it is possible (but definitely not needed) to deploy our
> mgmt node in active/passive HA.
>
> We have several hundred installations worldwide in production
> environments. Our strong PXE knowledge comes from more than 20 years of
> datacenter hosting experience, and it has never failed us in the last >10
> years.
>
> The main benefits of that:
>  - Immutable OS, freshly booted: every host has exactly the same version,
> same libraries, same kernel, same Ceph version, ...
>  - OS is heavily tested by us: every croit deployment runs exactly the same
> image. We can find errors much faster and hit far fewer of them.
>  - Easy updates: updating the OS, Ceph or anything else is just a node
> reboot. No cluster downtime, no service impact, fully automatic handling by
> our mgmt software.
>  - No need to install an OS: no maintenance costs, no labor required, no
> other OS management required.
>  - Centralized logs/stats: as the OS is booted into memory, all logs and
> statistics are collected in a central place for easy access.
>  - Easy to scale: it doesn't matter if you boot 3 or 300 nodes, they all
> boot the exact same image in a few seconds.
>  .. lots more
>
> Please do not hesitate to contact us directly. We always try to offer an
> excellent service and are strongly customer oriented.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.verges@xxxxxxxx
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> On Sat, Mar 21, 2020 at 1:53 PM huxiaoyu@xxxxxxxxxxxx <huxiaoyu@xxxxxxxxxxxx> wrote:
>
> > Hello, Martin,
> >
> > I notice that croit advocates running Ceph clusters without OS disks,
> > booting via PXE instead.
> >
> > Do you use an NFS server to serve the root file system for each node,
> > e.g. hosting configuration files, users and passwords, log files, etc.?
> > My question is: would the NFS server be a single point of failure? If the
> > NFS server goes down or the network experiences an outage, Ceph nodes may
> > not be able to write to their local file systems, possibly leading to a
> > service outage.
> >
> > How do you deal with these potential issues in production? I am a bit
> > worried...
> >
> > best regards,
> >
> > samuel
> >
> >
> >
> >
> > ------------------------------
> > huxiaoyu@xxxxxxxxxxxx
> >
> >
> >
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



