Hi Greg,
Accepting that an osd with outdated data can never accept writes,
or IO of any kind, how is it possible for the system to get into
this state?
-All osds are BlueStore, with checksums, mtime, etc.
-All osds are up and in
-No hw failures, lost disks, damaged journals or databases etc.
-The data became inconsistent
Thanks,
Denke.
On 11/02/2017 11:51 PM, Gregory Farnum wrote:
The scenario is actually a bit different, see:
Let's assume size=2, min_size=1
-We are looking at pg "A" acting [1, 2]
-osd 1 goes down
-osd 2 accepts a write for pg "A"
-osd 2 goes down
-osd 1 comes back up, while osd 2 still down
-osd 1 has no way to know osd 2 accepted a write in pg "A"
-osd 1 accepts a new write to pg "A"
-osd 2 comes back up.
Bang! osd 1 and 2 now have different views of pg "A" but both claim to
have current data.
In this case, OSD 1 will not accept IO, precisely because it cannot
prove it has the current data. That is the basic purpose of OSD peering
and holds in all cases.
-Greg
-K.
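
The peering rule Greg describes can be illustrated with a small model. This is a rough sketch in Python, not Ceph's actual implementation: the names `Interval` and `can_prove_current` are hypothetical, and deciding whether an interval "may have accepted writes" from `min_size` alone is a simplifying assumption. It also assumes every OSD can learn from the cluster map history which OSDs were acting in each past interval.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A period during which the acting set of one PG did not change.

    Hypothetical, simplified model for illustration only.
    """
    acting: frozenset  # ids of the OSDs that were up and acting
    min_size: int      # pool min_size in effect during the interval

    @property
    def maybe_went_rw(self) -> bool:
        # Assumption: writes were possible iff enough replicas were acting.
        return len(self.acting) >= self.min_size


def can_prove_current(up_now: set, history: list[Interval]) -> bool:
    """True if, for every past interval that may have accepted writes,
    at least one OSD from that interval is up now and can be asked."""
    for interval in history:
        if interval.maybe_went_rw and not (interval.acting & up_now):
            # Nobody who could have seen those writes is reachable,
            # so the surviving OSDs cannot prove their data is current.
            return False
    return True


# The scenario from this thread: size=2, min_size=1, pg "A" acting [1, 2].
history = [
    Interval(acting=frozenset({1, 2}), min_size=1),  # both OSDs up
    Interval(acting=frozenset({2}), min_size=1),     # osd 1 down, osd 2 may write
    Interval(acting=frozenset(), min_size=1),        # osd 2 down as well
]

print(can_prove_current({1}, history))     # osd 1 alone: False
print(can_prove_current({1, 2}, history))  # osd 2 is back: True
```

In this model osd 1 refuses IO in the second-to-last step of the scenario, because the map history shows an interval in which only osd 2 could have taken writes. Whether the PG may then serve IO with the OSDs that are up is a separate check against the current min_size, which the sketch does not cover.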
On 2017-11-01 20:27, Denes Dolhay wrote:
> Hello,
>
> I have a trick question for Mr. Turner's scenario:
> Let's assume size=2, min_size=1
> -We are looking at pg "A" acting [1, 2]
> -osd 1 goes down, OK
> -osd 1 comes back up, backfill of pg "A" commences from osd 2 to osd 1, OK
> -osd 2 goes down (and therefore pg "A"'s backfill to osd 1 is
> incomplete and stopped) not OK, but this is the case...
> --> In this event, why does osd 1 accept IO to pg "A", knowing full well
> that its data is outdated and will cause an inconsistent state?
> Wouldn't it be prudent to deny IO to pg "A" until either
> -osd 2 comes back (therefore we have a clean osd in the acting group)
> ... backfill would continue to osd 1 of course
> -or data in pg "A" is manually marked as lost, and then continues
> operation from osd 1's (outdated) copy?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com