Re: [ceph-users] HA and data recovery of CEPH

Hi Peng,
  Here are a few more observations that might help you further.
  1. If you are using multiple pools, separate them in terms of the CRUSH mapping and, if possible, on the underlying hardware as well.
  2. It is not mandatory to separate every pool, but pools whose load grows in proportion to the client load are the ones worth separating.
  3. A basic example is the separation of index and data.
  4. If multisite is in play, you may also want a separate mapping for the log pool.
  5. When a host goes down, the number of PGs that go into peering is directly proportional to the number of OSDs on that host and, in turn, to the pools they host.
  6. Separating the pools also gives better control over per-pool recovery.
  7. A single client operation can be viewed as a DAG of requests across multiple pools; a blocked operation on any one pool can slow down or block the entire request.
  8. When you talk about service downtime, first decide which metric you are looking at. For object storage it is the HTTP response code; for other clients such as CephFS or RBD there will be some SLA that you are trying to maintain as business as usual.
  9. A faster way to get through peering is to set the "norebalance" and "nobackfill" flags and let all the PGs move to an "active+*" state. After that, unset the flags and let recovery proceed (see the command sketch after this list).
  10. AFAIK, as long as the PGs are in an "active+*" state, I/O will continue to be served.
  11. In worse cases, if your PGs take too long to reach the active state and cause a service outage, you can try setting min_size to 1 (or another reduced value) so that fewer peer exchanges are needed at that instant. Again, this depends on the tunables you have configured for the ruleset.
  12. What is said in point 9 is only applicable to replicated pools.
  13. There are certain recovery tunables as well, viz. osd_recovery_max_active, osd_recovery_max_chunk, osd_max_push_objects, osd_max_backfills, osd_recovery_max_single_start.
  14. The tunables mentioned above control the recovery throttles.
  15. It has been observed that OSD memory and CPU usage go up during peering. You might want to double-check whether you are saturating any compute or network resources, as that leads to longer recovery times.
  16. Do check your kernel tunables as well and make sure they are set to optimal values.
  17. The points above are general practices that I have learnt; some of them may be applicable and some may not, depending on your overall infrastructure and deployment.
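
To make points 9, 11 and 13 a bit more concrete, here is a rough command sketch. Treat it as an illustration only: the pool name "mypool" and the throttle values are placeholders, so adapt them to your cluster and test before using them in production.

# Point 9: pause rebalance/backfill while the PGs re-peer, then release.
$ ceph osd set norebalance
$ ceph osd set nobackfill
$ ceph pg stat                 # wait until all PGs report an "active+*" state
$ ceph osd unset nobackfill
$ ceph osd unset norebalance

# Point 11 (replicated pools, use with care): temporarily lower min_size,
# then restore it once the cluster is healthy again.
$ ceph osd pool set mypool min_size 1
$ ceph osd pool set mypool min_size 2

# Point 13: recovery throttles can be adjusted at runtime, for example:
$ ceph tell 'osd.*' injectargs '--osd_recovery_max_active 1 --osd_max_backfills 1'

Similarly, for points 1 to 3, pools can be separated at the CRUSH level with a dedicated rule; the rule name, device class and pool name below are again placeholders:

$ ceph osd crush rule create-replicated index-rule default host ssd
$ ceph osd pool set my-index-pool crush_rule index-rule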
Hope this helps

Thanks
Romit Misra



On Sat, Nov 30, 2019 at 2:31 AM <ceph-users-request@xxxxxxxxxxxxxx> wrote:
Send ceph-users mailing list submissions to
        ceph-users@xxxxxxxxxxxxxx

To subscribe or unsubscribe via the World Wide Web, visit
        http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
or, via email, send a message with subject or body 'help' to
        ceph-users-request@xxxxxxxxxxxxxx

You can reach the person managing the list at
        ceph-users-owner@xxxxxxxxxxxxxx

When replying, please edit your Subject line so it is more specific
than "Re: Contents of ceph-users digest..."


Today's Topics:

   1. HA and data recovery of CEPH (Peng Bo)
   2. Re: HA and data recovery of CEPH (Nathan Fish)
   3. Re: HA and data recovery of CEPH (Peng Bo)
   4. Re: HA and data recovery of CEPH (jesper@xxxxxxxx)
   5. Re: HA and data recovery of CEPH (hfx@xxxxxxxxxx)
   6. Re: HA and data recovery of CEPH (Wido den Hollander)
   7.  Can I add existing rgw users to a tenant (Wei Zhao)
   8. Re: scrub errors on rgw data pool (M Ranga Swami Reddy)


----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Nov 2019 11:50:20 +0800
From: Peng Bo <pengbo@xxxxxxxxxxx>
To: ceph-users@xxxxxxxxxxxxxx
Subject: HA and data recovery of CEPH
Message-ID:
        <CABJnkZ9gaQkEpVntdb-ttSX_EbLLPzoOKsFD1gCw0dH7F3pvfw@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

Hi all,

We are working on using Ceph to build our HA system; the goal is that the
system should keep providing service even when a Ceph node is down or an OSD
is lost.

Currently, in our tests, once a node/OSD goes down the Ceph cluster needs
about 40 seconds to sync data, and our system cannot provide service during
that time.

My questions:

   - Is there any way we can reduce the data sync time?
   - How can we keep Ceph available when a node/OSD goes down?


BR

--
The modern Unified Communications provider

https://www.portsip.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20191129/66c88d83/attachment-0001.html>

------------------------------

Message: 2
Date: Thu, 28 Nov 2019 23:57:24 -0500
From: Nathan Fish <lordcirth@xxxxxxxxx>
To: Peng Bo <pengbo@xxxxxxxxxxx>
Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: HA and data recovery of CEPH
Message-ID:
        <CAKJgeVa8OtV-5x6Mquk7XLPJ+hdW6=jjdRsXNfTNckXPnf-MtA@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="UTF-8"

If correctly configured, your cluster should have zero downtime from a
single OSD or node failure. What is your crush map? Are you using
replica or EC? If your 'min_size' is not smaller than 'size', then you
will lose availability.
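
A quick way to check these settings on the cluster (the pool name below is just a placeholder):

$ ceph osd pool get mypool size
$ ceph osd pool get mypool min_size
$ ceph osd crush rule dump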

On Thu, Nov 28, 2019 at 10:50 PM Peng Bo <pengbo@xxxxxxxxxxx> wrote:
>
> Hi all,
>
> We are working on use CEPH to build our HA system, the purpose is the system should always provide service even a node of CEPH is down or OSD is lost.
>
> Currently, as we practiced once a node/OSD is down, the CEPH cluster needs to take about 40 seconds to sync data, our system can't provide service during that.
>
> My questions:
>
> Does there have any way that we can reduce the data sync time?
> How can we let the CEPH keeps available once a node/OSD is down?
>
>
> BR
>
> --
> The modern Unified Communications provider
>
> https://www.portsip.com
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


------------------------------

Message: 3
Date: Fri, 29 Nov 2019 13:21:44 +0800
From: Peng Bo <pengbo@xxxxxxxxxxx>
To: Nathan Fish <lordcirth@xxxxxxxxx>
Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>, hfx@xxxxxxxxxx
Subject: Re: HA and data recovery of CEPH
Message-ID:
        <CABJnkZ_Q-TTfKxfLNP2i_QkxkSDYCov+uYVQg31mwGeWvEm93A@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

Hi Nathan,

Thanks for the help.
My colleague will provide more details.

BR

On Fri, Nov 29, 2019 at 12:57 PM Nathan Fish <lordcirth@xxxxxxxxx> wrote:

> If correctly configured, your cluster should have zero downtime from a
> single OSD or node failure. What is your crush map? Are you using
> replica or EC? If your 'min_size' is not smaller than 'size', then you
> will lose availability.
>
> On Thu, Nov 28, 2019 at 10:50 PM Peng Bo <pengbo@xxxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > We are working on use CEPH to build our HA system, the purpose is the
> system should always provide service even a node of CEPH is down or OSD is
> lost.
> >
> > Currently, as we practiced once a node/OSD is down, the CEPH cluster
> needs to take about 40 seconds to sync data, our system can't provide
> service during that.
> >
> > My questions:
> >
> > Does there have any way that we can reduce the data sync time?
> > How can we let the CEPH keeps available once a node/OSD is down?
> >
> >
> > BR
> >
> > --
> > The modern Unified Communications provider
> >
> > https://www.portsip.com
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


--
The modern Unified Communications provider

https://www.portsip.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20191129/d714eeb8/attachment-0001.html>

------------------------------

Message: 4
Date: Fri, 29 Nov 2019 08:28:31 +0300
From: jesper@xxxxxxxx
To: Peng Bo <pengbo@xxxxxxxxxxx>
Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>, hfx@xxxxxxxxxx, Nathan
        Fish <lordcirth@xxxxxxxxx>
Subject: Re: HA and data recovery of CEPH
Message-ID: <1575005311.764819203@xxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"


Hi Nathan

Is that true?

The time it takes to reallocate the primary PG delivers "downtime" by design, right? Seen from a writing client's perspective?

Jesper



Sent from myMail for iOS


Friday, 29 November 2019, 06.24 +0100 from pengbo@xxxxxxxxxxx  <pengbo@xxxxxxxxxxx>:
>Hi Nathan,
>
>Thanks for the help.
>My colleague will provide more details.
>
>BR
>On Fri, Nov 29, 2019 at 12:57 PM Nathan Fish < lordcirth@xxxxxxxxx > wrote:
>>If correctly configured, your cluster should have zero downtime from a
>>single OSD or node failure. What is your crush map? Are you using
>>replica or EC? If your 'min_size' is not smaller than 'size', then you
>>will lose availability.
>>
>>On Thu, Nov 28, 2019 at 10:50 PM Peng Bo < pengbo@xxxxxxxxxxx > wrote:
>>>
>>> Hi all,
>>>
>>> We are working on use CEPH to build our HA system, the purpose is the system should always provide service even a node of CEPH is down or OSD is lost.
>>>
>>> Currently, as we practiced once a node/OSD is down, the CEPH cluster needs to take about 40 seconds to sync data, our system can't provide service during that.
>>>
>>> My questions:
>>>
>>> Does there have any way that we can reduce the data sync time?
>>> How can we let the CEPH keeps available once a node/OSD is down?
>>>
>>>
>>> BR
>>>
>>> --
>>> The modern Unified Communications provider
>>>
>>>  https://www.portsip.com
>>> _______________________________________________
>>> ceph-users mailing list
>>>  ceph-users@xxxxxxxxxxxxxx
>>>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>--
>The modern Unified Communications provider
>
>https://www.portsip.com
>_______________________________________________
>ceph-users mailing list
>ceph-users@xxxxxxxxxxxxxx
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20191129/aa5f2242/attachment-0001.html>

------------------------------

Message: 5
Date: Fri, 29 Nov 2019 14:23:01 +0800
From: "hfx@xxxxxxxxxx" <hfx@xxxxxxxxxx>
To: jesper <jesper@xxxxxxxx>,   "Peng Bo" <pengbo@xxxxxxxxxxx>
Cc: "Ceph Users" <ceph-users@xxxxxxxxxxxxxx>,  "Nathan Fish"
        <lordcirth@xxxxxxxxx>
Subject: Re: HA and data recovery of CEPH
Message-ID: <2019112914230082973160@xxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

Hi Nathan

We built a Ceph cluster with 3 nodes:
node-3: osd-2, mon-b,
node-4: osd-0, mon-a, mds-myfs-a, mgr
node-5: osd-1, mon-c, mds-myfs-b

The Ceph cluster was created by Rook.
Test phenomenon:
After one node goes down abnormally (e.g. a direct power-off), mounting the CephFS volume takes more than 40 seconds.
Normal Ceph cluster status:
$ ceph status     
  cluster:
    id:     776b5432-be9c-455f-bb2e-05cbf20d6f6a
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 20h)
    mgr: a(active, since 21h)
    mds: myfs:1 {0=myfs-a=up:active} 1 up:standby
    osd: 3 osds: 3 up (since 20h), 3 in (since 21h)

  data:
    pools:   2 pools, 136 pgs
    objects: 2.59k objects, 330 MiB
    usage:   25 GiB used, 125 GiB / 150 GiB avail
    pgs:     136 active+clean

  io:
    client:   1.5 KiB/s wr, 0 op/s rd, 0 op/s wr

Normal CephFS status:
$ ceph fs status
myfs - 3 clients
====
+------+--------+--------+---------------+-------+-------+
| Rank | State  |  MDS   |    Activity   |  dns  |  inos |
+------+--------+--------+---------------+-------+-------+
|  0   | active | myfs-a | Reqs:    0 /s | 2250  | 2059  |
+------+--------+--------+---------------+-------+-------+
+---------------+----------+-------+-------+
|      Pool     |   type   |  used | avail |
+---------------+----------+-------+-------+
| myfs-metadata | metadata |  208M | 39.1G |
|   myfs-data0  |   data   |  121M | 39.1G |
+---------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|    myfs-b   |
+-------------+
MDS version: ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)

Are you using replica or EC?
            => We are not using EC

'min_size' is not smaller than 'size'?
$ ceph osd dump | grep pool
pool 1 'myfs-metadata' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 16 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 2 'myfs-data0' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn last_change 141 lfor 0/0/53 flags hashpspool stripe_width 0 application cephfs

What is your crush map?
$ ceph osd crush dump
{
    "devices": [
        {
            "id": 0,
            "name": "osd.0",
            "class": "hdd"
        },
        {
            "id": 1,
            "name": "osd.1",
            "class": "hdd"
        },
        {
            "id": 2,
            "name": "osd.2",
            "class": "hdd"
        }
    ],
    "types": [
        {
            "type_id": 0,
            "name": "osd"
        },
        {
            "type_id": 1,
            "name": "host"
        },
        {
            "type_id": 2,
            "name": "chassis"
        },
        {
            "type_id": 3,
            "name": "rack"
        },
        {
            "type_id": 4,
            "name": "row"
        },
        {
            "type_id": 5,
            "name": "pdu"
        },
        {
            "type_id": 6,
            "name": "pod"
        },
        {
            "type_id": 7,
            "name": "room"
        },
        {
            "type_id": 8,
            "name": "datacenter"
        },
        {
            "type_id": 9,
            "name": "zone"
        },
        {
            "type_id": 10,
            "name": "region"
        },
        {
            "type_id": 11,
            "name": "root"
        }
    ],
    "buckets": [
        {
            "id": -1,
            "name": "default",
            "type_id": 11,
            "type_name": "root",
            "weight": 9594,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": -3,
                    "weight": 3198,
                    "pos": 0
                },
                {
                    "id": -5,
                    "weight": 3198,
                    "pos": 1
                },
                {
                    "id": -7,
                    "weight": 3198,
                    "pos": 2
                }
            ]
        },
        {
            "id": -2,
            "name": "default~hdd",
            "type_id": 11,
            "type_name": "root",
            "weight": 9594,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": -4,
                    "weight": 3198,
                    "pos": 0
                },
                {
                    "id": -6,
                    "weight": 3198,
                    "pos": 1
                },
                {
                    "id": -8,
                    "weight": 3198,
                    "pos": 2
                }
            ]
        },
        {
            "id": -3,
            "name": "node-4",
            "type_id": 1,
            "type_name": "host",
            "weight": 3198,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": 0,
                    "weight": 3198,
                    "pos": 0
                }
            ]
        },
        {
            "id": -4,
            "name": "node-4~hdd",
            "type_id": 1,
            "type_name": "host",
            "weight": 3198,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": 0,
                    "weight": 3198,
                    "pos": 0
                }
            ]
        },
        {
            "id": -5,
            "name": "node-5",
            "type_id": 1,
            "type_name": "host",
            "weight": 3198,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": 1,
                    "weight": 3198,
                    "pos": 0
                }
            ]
        },
        {
            "id": -6,
            "name": "node-5~hdd",
            "type_id": 1,
            "type_name": "host",
            "weight": 3198,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": 1,
                    "weight": 3198,
                    "pos": 0
                }
            ]
        },
        {
            "id": -7,
            "name": "node-3",
            "type_id": 1,
            "type_name": "host",
            "weight": 3198,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": 2,
                    "weight": 3198,
                    "pos": 0
                }
            ]
        },
        {
            "id": -8,
            "name": "node-3~hdd",
            "type_id": 1,
            "type_name": "host",
            "weight": 3198,
            "alg": "straw2",
            "hash": "rjenkins1",
            "items": [
                {
                    "id": 2,
                    "weight": 3198,
                    "pos": 0
                }
            ]
        }
    ],
    "rules": [
        {
            "rule_id": 0,
            "rule_name": "replicated_rule",
            "ruleset": 0,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                {
                    "op": "take",
                    "item": -1,
                    "item_name": "default"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 0,
                    "type": "host"
                },
                {
                    "op": "emit"
                }
            ]
        },
        {
            "rule_id": 1,
            "rule_name": "myfs-metadata",
            "ruleset": 1,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                {
                    "op": "take",
                    "item": -1,
                    "item_name": "default"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 0,
                    "type": "host"
                },
                {
                    "op": "emit"
                }
            ]
        },
        {
            "rule_id": 2,
            "rule_name": "myfs-data0",
            "ruleset": 2,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                {
                    "op": "take",
                    "item": -1,
                    "item_name": "default"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 0,
                    "type": "host"
                },
                {
                    "op": "emit"
                }
            ]
        }
    ],
    "tunables": {
        "choose_local_tries": 0,
        "choose_local_fallback_tries": 0,
        "choose_total_tries": 50,
        "chooseleaf_descend_once": 1,
        "chooseleaf_vary_r": 1,
        "chooseleaf_stable": 1,
        "straw_calc_version": 1,
        "allowed_bucket_algs": 54,
        "profile": "jewel",
        "optimal_tunables": 1,
        "legacy_tunables": 0,
        "minimum_required_version": "jewel",
        "require_feature_tunables": 1,
        "require_feature_tunables2": 1,
        "has_v2_rules": 0,
        "require_feature_tunables3": 1,
        "has_v3_rules": 0,
        "has_v4_buckets": 1,
        "require_feature_tunables5": 1,
        "has_v5_rules": 0
    },
    "choose_args": {}
}

Question:
How can I mount the CephFS volume as soon as possible after one node goes down abnormally? Any Ceph cluster (filesystem) configuration suggestions? Should we use EC?

Best Regards





hfx@xxxxxxxxxx

From: jesper
Date: 2019-11-29 13:28
To: Peng Bo
CC: Ceph Users; hfx; Nathan Fish
Subject: Re[2]: HA and data recovery of CEPH
Hi Nathan

Is that true?

The time it takes to reallocate the primary PG delivers "downtime" by design, right? Seen from a writing client's perspective?

Jesper



Sent from myMail for iOS


Friday, 29 November 2019, 06.24 +0100 from pengbo@xxxxxxxxxxx <pengbo@xxxxxxxxxxx>:
Hi Nathan,

Thanks for the help.
My colleague will provide more details.

BR

On Fri, Nov 29, 2019 at 12:57 PM Nathan Fish <lordcirth@xxxxxxxxx> wrote:
If correctly configured, your cluster should have zero downtime from a
single OSD or node failure. What is your crush map? Are you using
replica or EC? If your 'min_size' is not smaller than 'size', then you
will lose availability.

On Thu, Nov 28, 2019 at 10:50 PM Peng Bo <pengbo@xxxxxxxxxxx> wrote:
>
> Hi all,
>
> We are working on use CEPH to build our HA system, the purpose is the system should always provide service even a node of CEPH is down or OSD is lost.
>
> Currently, as we practiced once a node/OSD is down, the CEPH cluster needs to take about 40 seconds to sync data, our system can't provide service during that.
>
> My questions:
>
> Does there have any way that we can reduce the data sync time?
> How can we let the CEPH keeps available once a node/OSD is down?
>
>
> BR
>
> --
> The modern Unified Communications provider
>
> https://www.portsip.com
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


--
The modern Unified Communications provider

https://www.portsip.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20191129/55eda1e6/attachment-0001.html>

------------------------------

Message: 6
Date: Fri, 29 Nov 2019 08:29:48 +0100
From: Wido den Hollander <wido@xxxxxxxx>
To: jesper@xxxxxxxx, Peng Bo <pengbo@xxxxxxxxxxx>
Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>, hfx@xxxxxxxxxx
Subject: Re: HA and data recovery of CEPH
Message-ID: <7724520c-8659-bb86-050c-a1f90f7be79a@xxxxxxxx>
Content-Type: text/plain; charset=utf-8



On 11/29/19 6:28 AM, jesper@xxxxxxxx wrote:
> Hi Nathan
>
> Is that true?
>
> The time it takes to reallocate the primary PG delivers "downtime" by
> design, right? Seen from a writing client's perspective?
>

That is true. When an OSD goes down it will take a few seconds for its
placement groups to re-peer with the other OSDs. During that period,
writes to those PGs will stall for a couple of seconds.

I wouldn't say it's 40s, but it can take ~10s.

This is however by design. Consistency of data has a higher priority
than availability inside Ceph.

'Nothing in this world is for free'. Keep that in mind.
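
For example, the short stall can be observed during a controlled failure test with something like the following (the pool name is a placeholder):

$ ceph -w                                      # watch cluster events while one node is powered off
$ ceph pg stat                                 # PGs briefly show "peering" before returning to active+*
$ rados -p mypool bench 30 write -b 4096 -t 1  # a client-side write loop makes the stall visible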

Wido

> Jesper
>
>
>
> Sent from myMail for iOS
>
>
> Friday, 29 November 2019, 06.24 +0100 from pengbo@xxxxxxxxxxx
> <pengbo@xxxxxxxxxxx>:
>
>     Hi Nathan,
>
>     Thanks for the help.
>     My colleague will provide more details.
>
>     BR
>
>     On Fri, Nov 29, 2019 at 12:57 PM Nathan Fish <lordcirth@xxxxxxxxx
>     <mailto:lordcirth@xxxxxxxxx>> wrote:
>
>         If correctly configured, your cluster should have zero downtime
>         from a
>         single OSD or node failure. What is your crush map? Are you using
>         replica or EC? If your 'min_size' is not smaller than 'size',
>         then you
>         will lose availability.
>
>         On Thu, Nov 28, 2019 at 10:50 PM Peng Bo <pengbo@xxxxxxxxxxx
>         <mailto:pengbo@xxxxxxxxxxx>> wrote:
>         >
>         > Hi all,
>         >
>         > We are working on use CEPH to build our HA system, the purpose
>         is the system should always provide service even a node of CEPH
>         is down or OSD is lost.
>         >
>         > Currently, as we practiced once a node/OSD is down, the CEPH
>         cluster needs to take about 40 seconds to sync data, our system
>         can't provide service during that.
>         >
>         > My questions:
>         >
>         > Does there have any way that we can reduce the data sync time?
>         > How can we let the CEPH keeps available once a node/OSD is down?
>         >
>         >
>         > BR
>         >
>         > --
>         > The modern Unified Communications provider
>         >
>         > https://www.portsip.com
>         > _______________________________________________
>         > ceph-users mailing list
>         > ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>
>         > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>     --
>     The modern Unified Communications provider
>
>     https://www.portsip.com
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


------------------------------

Message: 7
Date: Fri, 29 Nov 2019 16:26:27 +0800
From: Wei Zhao <zhao6305@xxxxxxxxx>
To: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
Subject:   Can I add existing rgw users to a tenant
Message-ID:
        <CAGOEmcNfHZCqC15c9fQE-SM3KHn9WXv+LQ7ptTS2hL7XGqsPGw@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="UTF-8"

Hello:
    We want to use an RGW tenant as a group. Can we add existing RGW
users to a new tenant?


------------------------------

Message: 8
Date: Fri, 29 Nov 2019 15:45:40 +0530
From: M Ranga Swami Reddy <swamireddy@xxxxxxxxx>
To: ceph-users <ceph-users@xxxxxxxxxxxxxx>, ceph-devel
        <ceph-devel@xxxxxxxxxxxxxxx>
Subject: Re: scrub errors on rgw data pool
Message-ID:
        <CANA9Uk75OULZLAJbWqh+1sd5r5D4LMPrej-gZYUUagDbEZU2HQ@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

The primary OSD crashes with the assert below:
12.2.11/src/osd/ReplicatedBackend.cc:1445 assert(peer_missing.count(fromshard))
==
Here I have 2 OSDs with a BlueStore backend and 1 OSD with a FileStore backend.
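
For reference, the repair flow mentioned in the quoted mail below usually looks something like this (the PG id is a placeholder):

$ ceph health detail                                      # lists the PGs with scrub errors
$ rados list-inconsistent-obj 7.1b --format=json-pretty
$ ceph pg repair 7.1b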

On Mon, Nov 25, 2019 at 3:34 PM M Ranga Swami Reddy <swamireddy@xxxxxxxxx>
wrote:

> Hello - We are using the ceph 12.2.11 version (upgraded from Jewel 10.2.12
> to 12.2.11). In this cluster, we are having mix of filestore and bluestore
> OSD backends.
> Recently we are seeing the scrub errors on rgw buckets.data pool every
> day, after scrub operation performed by Ceph. If we run the PG repair, the
> errors will go way.
>
> Anyone seen the above issue?
> Is the mix of filestore backend has bug/issue with 12.2.11 version (ie
> Luminous).
> Is the mix of filestore and bluestore OSDs cause this type of issue?
>
> Thanks
> Swami
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20191129/ffa82f99/attachment-0001.html>

------------------------------

Subject: Digest Footer

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


------------------------------

End of ceph-users Digest, Vol 82, Issue 26
******************************************


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
