Re: Ceph and its failures

>> ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
>> 
>> Ceph contains
>>  MON: 3
>>  OSD: 3
>>
> For completeness sake, the OSDs are on 3 different hosts, right?

It is a single machine. I'm only running tests.

>> File system: ZFS
> That is the odd one out, very few people I'm aware of use it, support for
> it is marginal at best.
> And some of its features may of course obscure things.

I've been using ZFS on Linux for a long time and I'm happy with it.


> Exact specification please, as in how is ZFS configured (single disk,
> raid-z, etc)?

2 disks in mirror mode.

>> Kernel: 4.2.6
>> 
> While probably not related, I vaguely remember 4.3 being recommended for
> use with Ceph.

At the moment I can only run this kernel. But if I decide to use Ceph (that is, only if Ceph satisfies my requirements), I can switch to any other kernel.

>> 3. Does Ceph have auto heal option?
> No. 
> And neither is the repair function a good idea w/o checking the data on
> disk first.
> This is my biggest pet peeve with Ceph and you will find it mentioned
> frequently in this ML, just a few days ago this thread for example:
> "pg repair behavior? (Was: Re: getting rid of misplaced objects)"

It is very strange to have to recover data manually without knowing which copy is good.
If I have 3 copies of the data and 2 of them are corrupted, then I cannot recover the bad ones.
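
To make the point concrete, here is a minimal sketch of the kind of check I have in mind, assuming a checksum can be obtained for each replica by some means. A copy whose digest disagrees with a strict majority is the suspect one; with no majority, nothing can be decided automatically. The function name majority_digest is hypothetical, and this is not how Ceph itself picks an authoritative shard.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): print the digest that a strict
# majority of replicas agree on. Arguments: one digest per replica.
# Returns 1 when there are no arguments or no strict majority exists.
majority_digest() {
    [ "$#" -gt 0 ] || return 1
    total=$#
    # Count how often each digest occurs and keep the most frequent one.
    best=$(printf '%s\n' "$@" | sort | uniq -c | sort -rn | head -n 1)
    count=$(echo "$best" | awk '{print $1}')
    digest=$(echo "$best" | awk '{print $2}')
    # Strict majority: more than half of all replicas share this digest.
    if [ $((count * 2)) -gt "$total" ]; then
        echo "$digest"
    else
        return 1
    fi
}
```

Called with the three digests of one object, it prints the value shared by at least two of them, and fails if all three differ.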


------------------

I ran some new tests. The 3 new OSDs are now on different systems. The file system is ext3.

Same start as before.

# grep "aaaaaaaaa" * -R
Binary file osd/nmz-5/current/17.17_head/rbd\udata.1bef77ac761fb.0000000000000001__head_FB98F317__11 matches
Binary file osd/nmz-5-journal/journal matches

# ceph pg dump | grep 17.17
dumped all in format plain
17.17   1       0       0       0       0       4096    1       1       active+clean    2016-02-23 16:14:32.234638      291'1   309:44  [5,4,3] 5       [5,4,3] 5       0'0     2016-02-22 20:30:04.255301      0'0     2016-02-22 20:30:04.255301

# md5sum rbd\\udata.1bef77ac761fb.0000000000000001__head_FB98F317__11 
\c2642965410d118c7fe40589a34d2463  rbd\\udata.1bef77ac761fb.0000000000000001__head_FB98F317__11

# sed -i -r 's/aaaaaaaaaa/abaaaaaaaa/g' rbd\\udata.1bef77ac761fb.0000000000000001__head_FB98F317__11


# ceph pg deep-scrub 17.17

7fbd99e6c700  0 log_channel(cluster) log [INF] : 17.17 deep-scrub starts
7fbd97667700  0 log_channel(cluster) log [INF] : 17.17 deep-scrub ok

-- restarting OSD.5

# ceph pg deep-scrub 17.17

7f00f40b8700  0 log_channel(cluster) log [INF] : 17.17 deep-scrub starts
7f00f68bd700 -1 log_channel(cluster) log [ERR] : 17.17 shard 5: soid 17/fb98f317/rbd_data.1bef77ac761fb.0000000000000001/head data_digest 0x389d90f6 != known data_digest 0x4f18a4a5 from auth shard 3, missing attr _, missing attr snapset
7f00f68bd700 -1 log_channel(cluster) log [ERR] : 17.17 deep-scrub 0 missing, 1 inconsistent objects
7f00f68bd700 -1 log_channel(cluster) log [ERR] : 17.17 deep-scrub 1 errors
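
For anyone repeating this, a minimal sketch to rule out that the sed edit simply did nothing. It runs the same substitution on a scratch file and compares the md5 before and after. It assumes GNU sed and md5sum; the scratch file and its contents are made up for illustration, and the function name corrupt_and_compare is hypothetical. Reading the file on stdin also avoids the leading backslash that md5sum prints when a file name contains a backslash.

```shell
#!/bin/sh
# Self-contained demo on a scratch file (not a real OSD object).
# Prints "changed" if the substitution altered the checksum, else "unchanged".
corrupt_and_compare() {
    f=$(mktemp) || return 1
    # Made-up payload containing the pattern searched for in the test.
    printf 'xxxxaaaaaaaaaaxxxx\n' > "$f"
    before=$(md5sum < "$f" | awk '{print $1}')
    # Same substitution as in the test: turn the second 'a' into 'b'.
    sed -i -r 's/aaaaaaaaaa/abaaaaaaaa/g' "$f"
    after=$(md5sum < "$f" | awk '{print $1}')
    rm -f "$f"
    if [ "$before" != "$after" ]; then
        echo "changed"
    else
        echo "unchanged"
    fi
}
```

Running corrupt_and_compare should print "changed". If the same holds for the object file on the OSD, the first deep-scrub was comparing against something other than the bytes on disk.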


Is this a bug in Ceph 9.2.0?

