On 12.02.2013 11:22, Wido den Hollander wrote:
On 02/12/2013 10:57 AM, Oliver Liebel wrote:
hello,
i am trying to get a setup working where i can shut down 2 of 3 nodes
and ceph keeps working, e.g. like a raid 1 + spare.
the setup runs on precise 12.04 at the latest patchlevel, ceph version is
0.56.2,
every node acts as mon, mds and osd, using the default pools.
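(for context, a rough sketch of what an mkcephfs-style ceph.conf for this
layout could look like - the host-to-IP mapping is only an assumption based
on the monmap and crush map further down, and auth settings and osd
data/journal paths are omitted:)

[mon.0]
    host = jake
    mon addr = 192.168.99.33:6789
[mon.1]
    host = elwood
    mon addr = 192.168.99.34:6789
[mon.2]
    host = cab
    mon addr = 192.168.99.35:6789

[mds.0]
    host = jake
[mds.1]
    host = elwood
[mds.2]
    host = cab

[osd.0]
    host = jake
[osd.1]
    host = elwood
[osd.2]
    host = cab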
i tried different variants, changing the replication level (rep size) for
all (default) pools
(data, metadata, rbd) from 2 to 3, and min_size to 0 (i know this isn't a
good idea,
I'm not sure if min_size 0 actually works. (Should it even be a valid
value?) Have you tried setting it to "1"?
Yes.
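(for reference, the per-pool settings in 0.56 would normally be changed
with something like the following - the values shown are just the ones
from this test, applied to the three default pools:)

ceph osd pool set data size 3
ceph osd pool set metadata size 3
ceph osd pool set rbd size 3

ceph osd pool set data min_size 1
ceph osd pool set metadata min_size 1
ceph osd pool set rbd min_size 1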
Also, what does your crushmap look like? That would be interesting to see.
nothing was changed - it is the one generated by the default setup:
# begin crush map

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host jake {
    id -2        # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0       # rjenkins1
    item osd.0 weight 1.000
}
host elwood {
    id -4        # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0       # rjenkins1
    item osd.1 weight 1.000
}
host cab {
    id -5        # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0       # rjenkins1
    item osd.2 weight 1.000
}
rack unknownrack {
    id -3        # do not change unnecessarily
    # weight 3.000
    alg straw
    hash 0       # rjenkins1
    item jake weight 1.000
    item elwood weight 1.000
    item cab weight 1.000
}
root default {
    id -1        # do not change unnecessarily
    # weight 3.000
    alg straw
    hash 0       # rjenkins1
    item unknownrack weight 3.000
}

# rules
rule data {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
rule metadata {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
rule rbd {
    ruleset 2
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
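(for reference, the usual round trip for dumping, editing and re-injecting
this map, in case you want to experiment with the rules - the file names
are just placeholders:)

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt as needed, then recompile and inject it:
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new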
What does ceph -s say when you shut down the second node?
Output from ceph -s in "normal" state:
# ceph -s    (Tue Feb 12 11:36:04 2013)
   health HEALTH_OK
   monmap e1: 3 mons at {0=192.168.99.33:6789/0,1=192.168.99.34:6789/0,2=192.168.99.35:6789/0},
                        election epoch 62, quorum 0,1,2 0,1,2
   osdmap e95: 3 osds: 3 up, 3 in
   pgmap v1152: 768 pgs: 768 active+clean; 144 KB data, 1569 MB used, 53672 MB / 61437 MB avail
   mdsmap e70: 1/1/1 up {0=2=up:active}, 2 up:standby
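(to double-check that the size/min_size changes actually stuck on the
pools, the per-pool values can be read back from the osdmap - the exact
fields in the output vary a bit between versions:)

ceph osd dump | grep ^pool
# each pool line should show its rep size (and min_size, on versions that print it)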
and after shutting down the 1st and 2nd nodes:
root@cab:~# ceph -s
2013-02-12 11:39:16.164414 7fa5d5da3700 0 -- 192.168.99.35:0/17643 >> 192.168.99.34:6789/0 pipe(0x7fa5c0000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1).fault
2013-02-12 11:39:22.186134 7fa5cf683700 0 -- 192.168.99.35:0/17643 >> 192.168.99.33:6789/0 pipe(0x7fa5c00021a0 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-02-12 11:39:25.196626 7fa5d5da3700 0 -- 192.168.99.35:0/17643 >> 192.168.99.34:6789/0 pipe(0x7fa5c0002ed0 sd=4 :0 s=1 pgs=0 cs=0 l=1).fault
2013-02-12 11:39:31.218587 7fa5cf582700 0 -- 192.168.99.35:0/17643 >> 192.168.99.33:6789/0 pipe(0x7fa5c0003f60 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
.. and so on
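(when ceph -s just hangs like this, two things can help narrow it down -
note that the -m address and the admin socket path below are assumptions
based on the defaults for this setup:)

# point the client only at the surviving monitor instead of cycling
# through the dead ones (the monitors still need a majority to form a
# quorum before they will answer):
ceph -m 192.168.99.35:6789 -s

# or, if your version supports it, ask that monitor directly over its
# local admin socket:
ceph --admin-daemon /var/run/ceph/ceph-mon.2.asok mon_status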
thanks,
oliver
Wido
just a proof of concept)
but the result is always the same:
if i shut down the first node, everything keeps working,
but after shutting down the second node, ceph stops working and the mount
becomes unreachable.
any ideas?
thanks in advance
oliver
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com