Re: Frozen Erasure-coded-pool

Hi Max,

On 17/12/2014 20:57, Max Power wrote:
> I am trying to set up a small VM Ceph cluster to exercise with before creating a real
> cluster. Currently there are two OSDs on the same host. I wanted to create an
> erasure coded pool with k=1 and m=1 (yes, I know it's stupid, but it is a test
> case). 

This is going to fail in an undefined way; it should fail early instead of being allowed to proceed. I've created http://tracker.ceph.com/issues/10358, thanks for catching this :-) I'm not sure what to make of the rest of what you observed; it is probably tainted by the fact that the erasure coded pool cannot proceed.
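For what it's worth, the MAXINT you noticed in the second slot of the acting set is how Ceph prints CRUSH_ITEM_NONE (2147483647): CRUSH could not map that shard to any OSD. An erasure coded pool needs to place all k+m shards on distinct failure domains, so with ruleset-failure-domain=osd you want at least k+m OSDs. A minimal sketch of what I would try instead, assuming you add a third OSD to the test cluster (the profile name, pool name and PG counts below are only illustrative):

  ceph osd erasure-code-profile set ec_2_1 k=2 m=1 ruleset-failure-domain=osd
  ceph osd pool create liverpool 128 128 erasure ec_2_1
  ceph pg dump pgs_brief    # up/acting sets should list real OSD ids, no 2147483647

Once every PG maps all of its shards, adding the cache tier on top and retrying the rbd map / mkfs test should behave as expected.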

Cheers

> On top of it there is a cache tier (writeback), and I used the pool to
> make a RADOS block device with it. But when I wanted to format it with ext4, the
> system suddenly hung. At the moment I do not understand why.
> 
> I discovered that after the creation of the 'cold-storage' the acting primaries
> are set up correctly (about one half of the PGs on osd.0 and the other half on
> osd.1). But the second OSD in the acting set is always nonsense (MAXINT, a
> placeholder for 'not there'?). To my surprise the state is 'active+clean' - how
> can this be; shouldn't it be 'active+degraded'?
> 
> These are the commands I used (from my recollection)
> :# ceph osd erasure-code-profile get ec_1_1
>> directory=/usr/lib/x86_64-linux-gnu/ceph/erasure-code
>> k=1
>> m=1
>> plugin=jerasure
>> ruleset-failure-domain=osd
>> technique=reed_sol_van
> :# ceph osd pool create liverpool 300 300 erasure ec_1_1
> :# ceph osd pool create cache 100 100 replicated
> :# ceph osd tier add liverpool cache
> :# ceph osd tier cache-mode cache writeback
> :# ceph osd tier set-overlay liverpool cache
> :# rbd --pool liverpool create --size 1500 testdisk
> :# rbd --pool liverpool map testdisk
> :# mkfs.ext4 /dev/rbd/liverpool/testdisk
> 
> Now the mkfs freezes, and I can see this through ceph -w:
> 2014-12-17 19:08:56.466846 mon.0 [INF] pgmap v2062: 400 pgs: 400 active+clean;
> 140 bytes data, 88220 kB used, 2418 MB / 2504 MB avail; 47 B/s rd, 0 op/s
> 2014-12-17 19:11:20.697190 mon.0 [INF] pgmap v2064: 400 pgs: 307
> stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504
> MB avail
> 2014-12-17 19:11:20.388468 osd.1 [WRN] 6 slow requests, 6 included below; oldest
> blocked for > 124.270960 secs
> 2014-12-17 19:11:20.388556 osd.1 [WRN] slow request 124.270960 seconds old,
> received at 2014-12-17 19:09:16.116251: osd_op(client.6155.1:508
> rb.0.1807.2ae8944a.000000000005 [set-alloc-hint object_size 4194304 write_size
> 4194304,write 4091904~24576] 24.e6ca00e6 ondisk+write e590) v4 currently waiting
> for subops from 0
> [repeated a few times]
> 2014-12-17 19:11:21.911696 mon.0 [INF] osdmap e592: 2 osds: 1 up, 2 in
> 2014-12-17 19:11:22.053272 mon.0 [INF] pgmap v2065: 400 pgs: 307
> stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504
> MB avail
> 2014-12-17 19:11:24.826008 mon.0 [INF] osd.0 10.0.0.141:6800/7919 boot
> 2014-12-17 19:11:24.827218 mon.0 [INF] osdmap e593: 2 osds: 2 up, 2 in
> 2014-12-17 19:11:24.935173 mon.0 [INF] pgmap v2066: 400 pgs: 307
> stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504
> MB avail
> 2014-12-17 19:11:26.072303 mon.0 [INF] osdmap e594: 2 osds: 2 up, 2 in
> 2014-12-17 19:11:26.220102 mon.0 [INF] pgmap v2067: 400 pgs: 307
> stale+active+clean, 93 active+clean; 140 bytes data, 106 MB used, 2397 MB / 2504
> MB avail
> 2014-12-17 19:11:30.702281 mon.0 [INF] pgmap v2068: 400 pgs: 307
> stale+active+clean, 93 active+clean; 16308 kB data, 138 MB used, 2366 MB / 2504
> MB avail; 1471 kB/s wr, 7 op/s; 2184 kB/s, 0 objects/s recovering
> 2014-12-17 19:11:32.050330 mon.0 [INF] pgmap v2069: 400 pgs: 400 active+clean;
> 33924 kB data, 167 MB used, 2337 MB / 2504 MB avail; 4543 kB/s wr, 46 op/s; 3565
> kB/s, 1 objects/s recovering
> 2014-12-17 19:13:30.569447 mon.0 [INF] pgmap v2070: 400 pgs: 400 active+clean;
> 33924 kB data, 143 MB used, 2361 MB / 2504 MB avail
> 
> How is this explained? What have I done wrong?
> 
> Greetings!
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
