Re: HA and data recovery of CEPH

On 11/29/19 6:28 AM, jesper@xxxxxxxx wrote:
> Hi Nathan
> 
> Is that true?
> 
> The time it takes to re-peer and reassign the primary PG introduces
> “downtime” by design, right? Seen from a writing client's perspective.
> 

That is true. When an OSD goes down, it takes a few seconds for its
placement groups (PGs) to re-peer with the other OSDs. During that
period, writes to those PGs will stall.

I wouldn't say it's 40s, but it can take ~10s.

This is, however, by design: inside Ceph, data consistency has a higher
priority than availability.
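
Part of that window is how long the cluster waits before declaring an
OSD dead. As a rough sketch (the option names are real Ceph config
options; the values below are illustrative examples, not
recommendations -- setting them too aggressively causes false-positive
down reports under load):

```shell
# Inspect how long an OSD may miss heartbeats before peers report it down
ceph config get osd osd_heartbeat_grace

# Example only: shorten the grace period (default 20s) so a dead OSD
# is detected, and its PGs re-peer, sooner
ceph config set osd osd_heartbeat_grace 10

# The monitors also stretch the grace period for OSDs they consider
# laggy; that adjustment can be disabled if detection must stay fast
ceph config set mon mon_osd_adjust_heartbeat_grace false
```

These only shrink the detection window; the re-peering itself still
takes a few seconds once the OSD is marked down.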

'Nothing in this world is for free'. Keep that in mind.
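
As Nathan notes below, staying writable with one OSD down also requires
min_size to be smaller than size on each pool. A quick way to check
(the pool name 'mypool' is a placeholder):

```shell
# Number of replicas the pool keeps, and the minimum number of
# replicas that must be up for I/O to continue
ceph osd pool get mypool size
ceph osd pool get mypool min_size

# Or list size/min_size for all pools at once
ceph osd dump | grep 'pool'
```

With size=3 and min_size=2, a single OSD failure leaves two replicas,
so writes resume as soon as the PGs have re-peered.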

Wido

> Jesper
> 
> 
> 
> Sent from myMail for iOS
> 
> 
> Friday, 29 November 2019, 06.24 +0100 from pengbo@xxxxxxxxxxx
> <pengbo@xxxxxxxxxxx>:
> 
>     Hi Nathan, 
> 
>     Thanks for the help.
>     My colleague will provide more details.
> 
>     BR
> 
>     On Fri, Nov 29, 2019 at 12:57 PM Nathan Fish <lordcirth@xxxxxxxxx
>     <mailto:lordcirth@xxxxxxxxx>> wrote:
> 
>         If correctly configured, your cluster should have zero downtime
>         from a single OSD or node failure. What does your CRUSH map look
>         like? Are you using replication or EC? If your 'min_size' is not
>         smaller than 'size', then you will lose availability.
> 
>         On Thu, Nov 28, 2019 at 10:50 PM Peng Bo <pengbo@xxxxxxxxxxx
>         <mailto:pengbo@xxxxxxxxxxx>> wrote:
>         >
>         > Hi all,
>         >
>         > We are working on using Ceph to build our HA system; the goal
>         is that the system keeps providing service even when a Ceph
>         node is down or an OSD is lost.
>         >
>         > Currently, in our tests, once a node/OSD is down the Ceph
>         cluster takes about 40 seconds to sync data, and our system
>         can't provide service during that time.
>         >
>         > My questions:
>         >
>         > Is there any way to reduce the data sync time?
>         > How can we keep Ceph available once a node/OSD is down?
>         >
>         >
>         > BR
>         >
>         > --
>         > The modern Unified Communications provider
>         >
>         > https://www.portsip.com
>         > _______________________________________________
>         > ceph-users mailing list
>         > ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>
>         > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
> 
> 



