Re: HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-12 10:49 GMT+02:00 Burkhard Linke <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>:
Hi,


On 06/12/2017 10:31 AM, Daniel Carrasco wrote:
Hello,

I'm very new to Ceph, so maybe this is a noob question.

We have an architecture with some web servers (nginx, PHP...) sharing a common file server through NFS. Of course that is a SPOF, so we want to move to a replicated filesystem to avoid future problems.

We've already tested GlusterFS, but it is very slow reading small files with the official client (from 600 ms to 1700 ms to read the Doc page), and through NFS-Ganesha it fails a lot (permission errors, 404 when the file exists...).
The next thing we're trying is Ceph, which looks very good and has good performance even with small files (close to NFS performance: 90-100 ms vs. 100-120 ms), but in some tests I've done it stops working when an OSD is down.

My test architecture is two servers with one OSD and one MON each, and a third with a MON and an MDS. I've configured the cluster to keep two copies of every PG (just like a RAID 1) and everything looks fine (health OK, three monitors...).
My test client also works fine: it connects to the cluster and is able to serve the webpage without problems, but my problems start when an OSD goes down. The cluster detects that it is down, shows that it needs more OSDs to keep the two copies, designates a new MON leader and appears to be working, but the client cannot read new files until I power the OSD on again (it happens with both OSDs).

My question is: is there any way to tell Ceph to keep serving files even when an OSD is down?

I assume the data pool is configured with size=2 and min_size=2. This means that you need two active replicas to allow I/O to a PG. With one OSD down this requirement cannot be met.

You can either:
- add a third OSD
- set min_size=1 (a command sketch follows below)

The latter might be fine for a test setup, but do not run this configuration in production. NEVER. EVER. Search the mailing list for more details.
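
For anyone reading along later, roughly what that looks like when scripted; this is only a minimal sketch, and the pool name cephfs_data is an assumption here (check `ceph osd lspools` for your actual data pool):

#!/usr/bin/env python3
"""Minimal sketch (test clusters only): inspect and lower min_size on the
CephFS data pool so a size=2 pool keeps accepting I/O with one OSD down.
The pool name 'cephfs_data' is an assumption -- adjust to your own pools."""
import subprocess


def ceph(*args):
    """Run a ceph CLI command and return its stdout as text."""
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout.strip()


POOL = "cephfs_data"  # assumption: the data pool backing the filesystem

# Current replication settings, e.g. "size: 2" and "min_size: 2".
print(ceph("osd", "pool", "get", POOL, "size"))
print(ceph("osd", "pool", "get", POOL, "min_size"))

# Allow I/O with a single surviving replica. Do NOT do this in production.
print(ceph("osd", "pool", "set", POOL, "min_size", "1"))

Adding a third OSD and moving to size=3, min_size=2 is of course the answer for anything beyond a test rig.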

Thanks!! Just what I thought, a noob question hehe. Now it is working.
I'll search the list later, but it looks like it is there to avoid split brain or something similar.

My other question is about the MDS:
Is a multi-MDS environment stable? Because if I have a replicated FS to avoid a SPOF and I can only deploy one MDS, then we have a new SPOF...
This is to know whether I should use block device (RBD) pools instead of CephFS pools.

AFAIK active/active MDS setups are still considered experimental; active/standby(-replay) is a supported setup. We currently use one active and one standby-replay MDS for our CephFS instance serving several million files.

Failover between the MDS daemons works, but it might run into problems with a large number of open files (each requiring a stat operation). Depending on the number of open files, failover takes anywhere from a few seconds up to 5-10 minutes in our setup.
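
If it helps while you read up on it: a quick check that a standby MDS is actually registered, again just a sketch driven through the ceph CLI. The standby-replay behaviour itself is configured on the standby daemon (I believe via an 'mds standby replay' setting in ceph.conf), so double-check the docs for your release.

#!/usr/bin/env python3
"""Minimal sketch: confirm a standby MDS exists before trusting failover."""
import subprocess


def ceph(*args):
    """Run a ceph CLI command and return its stdout as text."""
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout.strip()


# Filesystems and the pools backing them.
print(ceph("fs", "ls"))

# One-line MDS summary: the active rank plus any daemons in up:standby.
# If nothing shows up as standby, the MDS is still a single point of failure.
print(ceph("mds", "stat"))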

Thanks again for your response.
It's not for performance purposes, so an active/standby setup will be enough. I'll read up on that configuration.

About the failover time: it is always better to have the page down for a few seconds than to wait for an admin to fix it.

 

Regards,
Burkhard Linke





Greetings!!

--
_________________________________________

      Daniel Carrasco Marín
      Ingeniería para la Innovación i2TIC, S.L.
      Tlf:  +34 911 12 32 84 Ext: 223
      www.i2tic.com
_________________________________________
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
