I am trying to set up a small Ceph cluster in VMs to practice on before building a real cluster. Currently there are two OSDs on the same host. I wanted to create an erasure-coded pool with k=1 and m=1 (yes, I know it's silly, but it is only a test case). On top of it there is a cache tier (writeback), and I used the pool to create a RADOS block device. But when I tried to format that device with ext4, the system suddenly hung, and at the moment I do not understand why.

I noticed that after the creation of the 'cold storage' (the EC pool), the acting primaries are set up correctly (roughly one half of the PGs on osd.0 and the other half on osd.1), but the second OSD in the acting set is always nonsense (MAXINT, a placeholder for 'not there'?). To my surprise the state is still 'active+clean' - how can this be? Shouldn't it be 'active+degraded'?

These are the commands I used (from my recollection):

:# ceph osd erasure-code-profile get ec_1_1
> directory=/usr/lib/x86_64-linux-gnu/ceph/erasure-code
> k=1
> m=1
> plugin=jerasure
> ruleset-failure-domain=osd
> technique=reed_sol_van
:# ceph osd pool create liverpool 300 300 erasure ec_1_1
:# ceph osd pool create cache 100 100 replicated
:# ceph osd tier add liverpool cache
:# ceph osd tier cache-mode cache writeback
:# ceph osd tier set-overlay liverpool cache
:# rbd --pool liverpool create --size 1500 testdisk
:# rbd --pool liverpool map testdisk
:# mkfs.ext4 /dev/rbd/liverpool/testdisk

Now the mkfs freezes, and I can see this through ceph -w:

2014-12-17 19:08:56.466846 mon.0 [INF] pgmap v2062: 400 pgs: 400 active+clean; 140 bytes data, 88220 kB used, 2418 MB / 2504 MB avail; 47 B/s rd, 0 op/s
2014-12-17 19:11:20.697190 mon.0 [INF] pgmap v2064: 400 pgs: 307 stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504 MB avail
2014-12-17 19:11:20.388468 osd.1 [WRN] 6 slow requests, 6 included below; oldest blocked for > 124.270960 secs
2014-12-17 19:11:20.388556 osd.1 [WRN] slow request 124.270960 seconds old, received at 2014-12-17 19:09:16.116251: osd_op(client.6155.1:508 rb.0.1807.2ae8944a.000000000005 [set-alloc-hint object_size 4194304 write_size 4194304,write 4091904~24576] 24.e6ca00e6 ondisk+write e590) v4 currently waiting for subops from 0
[repeated a few times]
2014-12-17 19:11:21.911696 mon.0 [INF] osdmap e592: 2 osds: 1 up, 2 in
2014-12-17 19:11:22.053272 mon.0 [INF] pgmap v2065: 400 pgs: 307 stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504 MB avail
2014-12-17 19:11:24.826008 mon.0 [INF] osd.0 10.0.0.141:6800/7919 boot
2014-12-17 19:11:24.827218 mon.0 [INF] osdmap e593: 2 osds: 2 up, 2 in
2014-12-17 19:11:24.935173 mon.0 [INF] pgmap v2066: 400 pgs: 307 stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504 MB avail
2014-12-17 19:11:26.072303 mon.0 [INF] osdmap e594: 2 osds: 2 up, 2 in
2014-12-17 19:11:26.220102 mon.0 [INF] pgmap v2067: 400 pgs: 307 stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504 MB avail
2014-12-17 19:11:30.702281 mon.0 [INF] pgmap v2068: 400 pgs: 307 stale+active+clean, 93 active+clean; 16308 kB data, 138 MB used, 2366 MB / 2504 MB avail; 1471 kB/s wr, 7 op/s; 2184 kB/s, 0 objects/s recovering
2014-12-17 19:11:32.050330 mon.0 [INF] pgmap v2069: 400 pgs: 400 active+clean; 33924 kB data, 167 MB used, 2337 MB / 2504 MB avail; 4543 kB/s wr, 46 op/s; 3565 kB/s, 1 objects/s recovering
2014-12-17 19:13:30.569447 mon.0 [INF] pgmap v2070: 400 pgs: 400 active+clean; 33924 kB data, 143 MB used, 2361 MB / 2504 MB avail

How is this explained? What have I done wrong?
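In case it helps, this is roughly how I looked at the acting sets (a sketch from memory; the pg id below is only a made-up example, not one I actually checked). The second entry of the up/acting sets is where I see the 2147483647 (MAXINT) placeholder mentioned above:

:# ceph pg dump pgs_brief
:# ceph pg map 24.e6
:# ceph osd pool get liverpool min_size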
Greetings!