Hey Andrzej...
As Jan replied, I would first try to recover what I can from the
Ceph cluster. For the time being, I would not be concerned with
CephFS.
I would also back up the current OSDs so that, if something goes
wrong, you can go back to the current state.
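I am not sure what the cleanest way to do that backup is; a rough sketch, assuming the OSD data lives in the default /var/lib/ceph/osd/ceph-<id> directory, sysvinit-style init scripts (as shipped with hammer), osd.3 as an example id and /backup as an example destination, could be:

   # stop the OSD so its data directory is quiescent
   service ceph stop osd.3
   # archive the whole OSD data directory somewhere with enough space
   tar -czf /backup/osd.3-backup.tar.gz -C /var/lib/ceph/osd ceph-3
   # (or take a raw image of the OSD partition with dd instead)
   service ceph start osd.3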
Recovering the cluster would consist of understanding which data
is lost. I never had to do this, but naively, I would (a rough
sketch of the commands follows after the list):
0) Stop the mds servers
1) Try to find which PGs were on the OSDs that failed in different
hosts (running 'ceph health detail' or 'ceph pg dump')
2) Mark the failing OSDs as lost
3) Once you are sure which PGs are unrecoverable, mark them as
lost (ceph pg <id> mark_unfound_lost delete)
4) Remove the OSDs from the cluster
5) Give it some time and see if it recovers. The idea is to get
into a situation where the cluster only complains about mds
problems. Something like:
ceph health detail
HEALTH_WARN mds cluster is degraded
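Put into commands - and take this with a grain of salt, since I never ran any of it; osd.3 and pg 2.5f are just example ids, and the service syntax assumes the sysvinit scripts shipped with hammer - I imagine it would look roughly like:

   # 0) stop the MDS daemons
   service ceph stop mds
   # 1) see which PGs are incomplete/down and which OSDs they were on
   ceph health detail
   ceph pg dump_stuck inactive
   ceph pg 2.5f query        # lists the OSDs the PG is still waiting for
   # 2) mark the dead OSDs as lost
   ceph osd lost 3 --yes-i-really-mean-it
   # 3) once you are sure a PG cannot be recovered
   ceph pg 2.5f mark_unfound_lost delete
   # 4) remove the dead OSDs from the cluster
   ceph osd out 3
   ceph osd crush remove osd.3
   ceph auth del osd.3
   ceph osd rm 3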
At that point, I think, you are in a position where you can start
thinking about recovering the filesystem. I would ask for a new
round of help/suggestions when you reach that stage.
I would also wait for further comments on the procedure above since
I never tried it myself. Finally, I would also suggest a good look
at
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/
Kind Regards
Goncalo
On 08/27/2015 12:36 AM, Jan Schermer wrote:
Most of the data is still here, but you won't be able to just
"mount" it if it's inconsistent.
I don't use CephFS so someone else could tell you
if it's able to repair the filesystem with some parts missing.
You lost the part of the data whose copies were only on the
single failed disk in one node and on either of the failed disks
in the other node, since no other copy exists. How much data you
lost I don't know exactly, but since you only have 16 OSDs I'm
afraid it will probably be on the order of ~3%. How many
"files" are intact is a different question - it could be that
every file is missing 3% of its contents, which would make the
loss total.
Guys? I have no idea how files map to pgs and
objects in CephFS...
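My rough understanding - just a guess, the pool name 'data' and the object name below are only examples - is that CephFS stores each file's data as RADOS objects named <inode in hex>.<stripe index> in the data pool, in which case something like this would show which PG and OSDs one of those objects lands on, but I'd like someone who actually knows CephFS to confirm:

   # list a few objects in the (assumed) CephFS data pool
   rados -p data ls | head
   # show which PG and OSDs an example object maps to
   ceph osd map data 10000000abc.00000000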
Jan
Thank you for the answer. I lost 2 disks on the 1st node and
1 disk on the 2nd. Do I understand correctly that it is not
possible to recover the data even partially? Unfortunately those
disks are lost forever.
Andrzej
On 2015-08-26 at 12:26, Jan Schermer wrote:
If you lost 3 disks with size 2 and at least 2 of
those disks were in different hosts, that means you
lost data with the default CRUSH map.
There's nothing you can do but either
get those disks back in or recover from backup.
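If you want to double-check the replica count and the failure domain, something like this should show them - the pool name 'data' is just a guess for the CephFS data pool:

   ceph osd pool get data size
   ceph osd tree              # shows which OSDs sit under which host
   ceph osd crush rule dump   # shows the failure domain used by the rule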
Jan
Hi,
We have a Ceph cluster (version 0.94.2) which consists of four
nodes with four disks on each node. Ceph is configured to hold two
replicas (size 2). We use this cluster for a Ceph filesystem.
A few days ago we had a power outage, after which I had to replace
three of our cluster's OSD disks. All OSD disks are now online,
but I'm unable to mount the filesystem and constantly receive
'mount error 5 = Input/output error'. Ceph status shows many
'incomplete' pgs and that the 'mds cluster is degraded'. According
to 'ceph health detail' the mds is replaying its journal.
[root@cnode0 ceph]# ceph -s
    cluster 39c717a3-5e15-4e5e-bc54-7e7f1fd0ee24
     health HEALTH_WARN
            25 pgs backfill_toofull
            10 pgs degraded
            126 pgs down
            263 pgs incomplete
            54 pgs stale
            10 pgs stuck degraded
            263 pgs stuck inactive
            54 pgs stuck stale
            289 pgs stuck unclean
            10 pgs stuck undersized
            10 pgs undersized
            4 requests are blocked > 32 sec
            recovery 27139/10407227 objects degraded (0.261%)
            recovery 168597/10407227 objects misplaced (1.620%)
            4 near full osd(s)
            too many PGs per OSD (312 > max 300)
            mds cluster is degraded
     monmap e6: 6 mons at {0=x.x.70.1:6789/0,0m=x.x.71.1:6789/0,1=x.x.70.2:6789/0,1m=x.x.71.2:6789/0,2=x.x.70.3:6789/0,2m=x.x.71.3:6789/0}
            election epoch 2958, quorum 0,1,2,3,4,5 0,1,2,0m,1m,2m
     mdsmap e1236: 1/1/1 up {0=2=up:replay}, 2 up:standby
     osdmap e83705: 16 osds: 16 up, 16 in; 26 remapped pgs
      pgmap v40869228: 2496 pgs, 3 pools, 16952 GB data, 5046 kobjects
            32825 GB used, 11698 GB / 44524 GB avail
            27139/10407227 objects degraded (0.261%)
            168597/10407227 objects misplaced (1.620%)
                2153 active+clean
                 137 incomplete
                 126 down+incomplete
                  54 stale+active+clean
                  15 active+remapped+backfill_toofull
                  10 active+undersized+degraded+remapped+backfill_toofull
                   1 active+remapped
[root@cnode0 ceph]#
I wasn't able to find any solution on the Internet,
and I worry I will make things even worse if I
continue to troubleshoot this on my own. I'm stuck.
Could you please help?
Thanks.
Andrzej
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW 2006
T: +61 2 93511937