Re: PGs inconsistent, do I fear data loss?

Denes Dolhay <denke@xxxxxxxxxxxx> · Fri, 3 Nov 2017 00:05:18 +0100



    Hi Greg,
    Accepting that an OSD with outdated data can never accept writes,
      or IO of any kind, how is it possible for the system to get into
      this state?
    -All OSDs are BlueStore (checksums, mtime, etc.)
    -All OSDs are up and in
    -No hardware failures, lost disks, damaged journals or databases, etc.
    -The data became inconsistent
    

    Thanks,

    
    Denke.

    
    On 11/02/2017 11:51 PM, Gregory Farnum wrote:

    
          On Thu, Nov 2, 2017 at 1:21 AM koukou73gr <koukou73gr@xxxxxxxxx> wrote:

          
          The scenario is actually a bit different, see:

            
            Let's assume size=2, min_size=1

            -We are looking at pg "A" acting [1, 2]

            -osd 1 goes down

            -osd 2 accepts a write for pg "A"

            -osd 2 goes down

            -osd 1 comes back up, while osd 2 still down

            -osd 1 has no way to know osd 2 accepted a write in pg "A"

            -osd 1 accepts a new write to pg "A"

            -osd 2 comes back up.

            
            bang! osd 1 and 2 now have different views of pg "A" but
            both claim to have current data.
          

          In this case, OSD 1 will not accept IO
            precisely because it cannot prove it has the current data.
            That is the basic purpose of OSD peering, and it holds in all
            cases.
          -Greg
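
For anyone who wants to see the rule above in concrete form, here is a minimal sketch of the idea, not Ceph's actual peering code. It assumes a simplified model in which the PG's history is a list of past acting sets, and an OSD may only serve IO if, for every past interval that could have taken writes, at least one OSD from that interval is up now (or has been manually marked lost). The names Interval, maybe_went_rw and can_prove_current are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One past period during which the PG's acting set did not change."""
    acting: frozenset  # ids of the OSDs that served the PG in this interval


def maybe_went_rw(interval, min_size):
    # Writes were possible only if enough OSDs were present to serve IO.
    return len(interval.acting) >= min_size


def can_prove_current(past_intervals, up_now, min_size, lost=frozenset()):
    """Return True if the OSDs in up_now can rule out unseen writes.

    For every past interval that may have accepted writes, at least one
    OSD from that interval must be up now (or explicitly marked lost by
    the operator). Otherwise a newer write could exist that nobody
    present knows about.
    """
    for interval in past_intervals:
        if not maybe_went_rw(interval, min_size):
            continue
        if not any(osd in up_now or osd in lost for osd in interval.acting):
            return False
    return True


# koukou73gr's scenario: size=2, min_size=1, pg "A" acting [1, 2]
history = [
    Interval(frozenset({1, 2})),  # both OSDs up
    Interval(frozenset({2})),     # osd 1 down, osd 2 accepts a write
]

only_osd1 = can_prove_current(history, up_now={1}, min_size=1)
both_back = can_prove_current(history, up_now={1, 2}, min_size=1)
print(only_osd1, both_back)  # prints: False True
```

In this model osd 1 alone is refused because nobody from the interval where osd 2 ran by itself is present, so the "osd 1 accepts a new write" step never happens. Once osd 2 returns, or is marked lost, the check passes.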
          

            -K.

            
            On 2017-11-01 20:27, Denes Dolhay wrote:

            > Hello,

            >

            > I have a trick question for Mr. Turner's scenario:

            > Let's assume size=2, min_size=1

            > -We are looking at pg "A" acting [1, 2]

            > -osd 1 goes down, OK

            > -osd 1 comes back up, backfill of pg "A" commences from osd 2 to osd 1, OK

            > -osd 2 goes down (and therefore pg "A"'s backfill to osd 1 is

            > incomplete and stopped) not OK, but this is the case...

            > --> In this event, why does osd 1 accept IO to pg "A" knowing full well

            > that its data is outdated and will cause an inconsistent state?

            > Wouldn't it be prudent to deny IO to pg "A" until either

            > -osd 2 comes back (therefore we have a clean osd in the acting group)

            > ... backfill would continue to osd 1 of course

            > -or data in pg "A" is manually marked as lost, and then continues

            > operation from osd 1's (outdated) copy?

            _______________________________________________

            ceph-users mailing list

            ceph-users@xxxxxxxxxxxxxx

            http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

          