Hi Eric,

Regarding "From the output of 'ceph osd pool ls detail' you can see min_size=4, the crush rule says min_size=3, however the pool does NOT survive 2 hosts failing. Am I missing something?":

For your EC profile (k=3, m=2) you need to lower the pool's min_size to 3 if the pool is to keep serving reads and writes with two hosts down.

RUN: sudo ceph osd pool set ec32pool min_size 3

Kind regards
Geoffrey Rhodes
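To spell out the arithmetic (this is my understanding of the Luminous defaults, so treat the exact default as an assumption): an EC pool with k=3 and m=2 has size = k + m = 5, and Ceph sets min_size = k + 1 = 4 by default, which matches the min_size 4 in your pool detail output. Two failed hosts leave 5 - 2 = 3 chunks per PG. Three chunks are still enough to reconstruct the data (3 >= k), but 3 < 4, so the PGs stop serving I/O, which is exactly the behaviour you saw. With min_size lowered to 3 (= k) the pool keeps serving I/O through a double host failure, at the cost of accepting writes while no redundancy margin is left. You can confirm the change afterwards with the same command you already used:

RUN: sudo ceph osd pool ls detail

and check that the pool now reports min_size 3.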
On Mon, 27 Jan 2020 at 22:11, <ceph-users-request@xxxxxxx> wrote:

> Today's Topics:
>
>    1. Re: EC pool creation results in incorrect M value? (Paul Emmerich)
>    2. Re: EC pool creation results in incorrect M value? (Smith, Eric)
>    3. Re: EC pool creation results in incorrect M value? (Smith, Eric)
>    4. data loss on full file system? (Håkan T Johansson)
>
> ----------------------------------------------------------------------
>
> Date: Mon, 27 Jan 2020 17:14:55 +0100
> From: Paul Emmerich <paul.emmerich@xxxxxxxx>
> Subject: Re: EC pool creation results in incorrect M value?
> To: "Smith, Eric" <Eric.Smith@xxxxxxxx>
> Cc: "ceph-users@xxxxxxx" <ceph-users@xxxxxxx>
>
> min_size in the crush rule and min_size in the pool are completely
> different things that happen to share the same name.
>
> Ignore min_size in the crush rule, it has virtually no meaning in
> almost all cases (like this one).
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Mon, Jan 27, 2020 at 3:41 PM Smith, Eric <Eric.Smith@xxxxxxxx> wrote:
> >
> > I have a Ceph Luminous (12.2.12) cluster with 6 nodes. I'm attempting to
> > create an EC 3+2 pool with the following commands:
> >
> > Create the EC profile:
> >
> > ceph osd erasure-code-profile set es32 k=3 m=2 plugin=jerasure w=8
> > technique=reed_sol_van crush-failure-domain=host crush-root=sgshared
> >
> > Verify profile creation:
> >
> > [root@mon-1 ~]# ceph osd erasure-code-profile get es32
> > crush-device-class=
> > crush-failure-domain=host
> > crush-root=sgshared
> > jerasure-per-chunk-alignment=false
> > k=3
> > m=2
> > plugin=jerasure
> > technique=reed_sol_van
> > w=8
> >
> > Create a pool using this profile:
> >
> > ceph osd pool create ec32pool 1024 1024 erasure es32
> >
> > List pool detail:
> >
> > pool 31 'es32' erasure size 5 min_size 4 crush_rule 11 object_hash
> > rjenkins pg_num 1024 pgp_num 1024 last_change 1568 flags hashpspool
> > stripe_width 12288 application ES
> >
> > Here's the crush rule that's created:
> >
> > {
> >     "rule_id": 11,
> >     "rule_name": "es32",
> >     "ruleset": 11,
> >     "type": 3,
> >     "min_size": 3,
> >     "max_size": 5,
> >     "steps": [
> >         { "op": "set_chooseleaf_tries", "num": 5 },
> >         { "op": "set_choose_tries", "num": 100 },
> >         { "op": "take", "item": -2, "item_name": "sgshared" },
> >         { "op": "chooseleaf_indep", "num": 0, "type": "host" },
> >         { "op": "emit" }
> >     ]
> > },
> >
> > From the output of "ceph osd pool ls detail" you can see min_size=4, the
> > crush rule says min_size=3, however the pool does NOT survive 2 hosts
> > failing.
> >
> > Am I missing something?
>
> ------------------------------
>
> Date: Mon, 27 Jan 2020 16:22:06 +0000
> From: "Smith, Eric" <Eric.Smith@xxxxxxxx>
> Subject: Re: EC pool creation results in incorrect M value?
> To: Paul Emmerich <paul.emmerich@xxxxxxxx>
> Cc: "ceph-users@xxxxxxx" <ceph-users@xxxxxxx>
>
> Thanks for the info regarding min_size in the crush rule - does this seem
> like a bug to you then? Is anyone else able to reproduce this?
>
> ------------------------------
>
> Date: Mon, 27 Jan 2020 16:45:52 +0000
> From: "Smith, Eric" <Eric.Smith@xxxxxxxx>
> Subject: Re: EC pool creation results in incorrect M value?
> To: "Smith, Eric" <Eric.Smith@xxxxxxxx>, Paul Emmerich <paul.emmerich@xxxxxxxx>
> Cc: "ceph-users@xxxxxxx" <ceph-users@xxxxxxx>
>
> OK I see this: https://github.com/ceph/ceph/pull/8008
>
> Perhaps it's just to be safe...
>
> ------------------------------
>
> Date: Mon, 27 Jan 2020 21:10:10 +0100
> From: Håkan T Johansson <f96hajo@xxxxxxxxxxx>
> Subject: data loss on full file system?
> To: <ceph-users@xxxxxxx>
>
> Hi,
>
> for test purposes, I have set up two 100 GB OSDs, one taking the data pool
> and the other the metadata pool for cephfs.
>
> Am running 14.2.6-1-gffd69200ad-1 with packages from
> https://mirror.croit.io/debian-nautilus
>
> Am then running a program that creates a lot of 1 MiB files by calling
> fopen(), fwrite() and fclose() for each of them. Error codes are checked.
>
> This works successfully for ~100 GB of data, and then strangely also
> succeeds for many more hundreds of GB of data... ??
>
> All written files have size 1 MiB with 'ls', and thus should contain the
> data written. However, on inspection, the files written after the first
> ~100 GiB are full of just 0s (hexdump -C).
>
> To further test this, I use the standard tool 'cp' to copy a few
> random-content files into the full cephfs filesystem. cp reports no
> complaints, and after the copy operations, content is seen with hexdump -C.
> However, after forcing the data out of cache on the client by reading
> other, earlier created files, hexdump -C shows all-0 content for the files
> copied with 'cp'. Data that was there is suddenly gone...?
>
> I am new to ceph. Is there an option I have missed to avoid this
> behaviour? (I could not find one in
> https://docs.ceph.com/docs/master/man/8/mount.ceph/ )
>
> Is this behaviour related to https://docs.ceph.com/docs/mimic/cephfs/full/ ?
>
> (That page states 'sometime after a write call has already returned 0'.
> But if write returns 0, then no data has been written, so the user program
> would not assume any kind of success.)
>
> Best regards,
> Håkan
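Regarding the data loss report above: my understanding (treat this as an assumption, not a statement of the CephFS client's exact behaviour) is that with buffered, asynchronous writes an out-of-space error may only be delivered when the dirty data is actually flushed to the cluster, i.e. at fflush()/fsync()/fclose() time rather than at fwrite() time. Below is a minimal sketch of a per-file test that checks every call which can surface such a delayed error; it is not the original test program, and the file name and buffer contents are made up for illustration.

    /* Minimal sketch: write one 1 MiB file and check every call that can
     * report a delayed write error, including fsync(). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        const char *path = (argc > 1) ? argv[1] : "testfile.bin";  /* hypothetical path */
        static char buf[1024 * 1024];              /* 1 MiB of data, as in the test */
        memset(buf, 0xab, sizeof(buf));

        FILE *f = fopen(path, "wb");
        if (!f) { perror("fopen"); return EXIT_FAILURE; }

        if (fwrite(buf, 1, sizeof(buf), f) != sizeof(buf)) {
            perror("fwrite");                      /* short write reported immediately */
            return EXIT_FAILURE;
        }

        /* fwrite() success only means the data reached a buffer.  Errors such
         * as ENOSPC may only show up when the data is flushed out, so check
         * fflush(), fsync() and fclose() as well. */
        if (fflush(f) != 0)         { perror("fflush"); return EXIT_FAILURE; }
        if (fsync(fileno(f)) != 0)  { perror("fsync");  return EXIT_FAILURE; }
        if (fclose(f) != 0)         { perror("fclose"); return EXIT_FAILURE; }

        return EXIT_SUCCESS;
    }

If fsync() or fclose() starts failing (for example with ENOSPC) once the data pool fills up, that would explain the zero-filled files: the writes were accepted into the client cache but could never be persisted.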
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> ------------------------------
>
> End of ceph-users Digest, Vol 84, Issue 44
> ******************************************

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx