On Thu, Mar 26, 2015 at 2:30 PM, Lee Revell <rlrevell@xxxxxxxxx> wrote:
> On Thu, Mar 26, 2015 at 4:40 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> Has the OSD actually been detected as down yet?
>>
>
> I believe it has, but I can't directly check because "ceph health"
> starts to hang when I down the second node.

Oh. You need to keep a quorum of your monitors running (just the
monitor processes, not of everything in the system) or nothing at all
is going to work. That's how we prevent split-brain issues. (See the
notes at the bottom of this mail for checking a monitor's state
directly when quorum is lost.)

>
>>
>> You'll also need to set that min size on your existing pools ("ceph
>> osd pool <pool> set min_size 1" or similar) to change their behavior;
>> the config option only takes effect for newly-created pools. (Thus
>> the "default".)
>
>
> I've done this, but the behavior is the same:
>
> $ for f in `ceph osd lspools | sed 's/[0-9]//g' | sed 's/,//g'`; do ceph osd pool set $f min_size 1; done
> set pool 0 min_size to 1
> set pool 1 min_size to 1
> set pool 2 min_size to 1
> set pool 3 min_size to 1
> set pool 4 min_size to 1
> set pool 5 min_size to 1
> set pool 6 min_size to 1
> set pool 7 min_size to 1
>
> $ ceph -w
>     cluster db460aa2-5129-4aaa-8b2e-43eac727124e
>      health HEALTH_WARN 1 mons down, quorum 0,1 ceph-node-1,ceph-node-2
>      monmap e3: 3 mons at {ceph-node-1=192.168.122.121:6789/0,ceph-node-2=192.168.122.131:6789/0,ceph-node-3=192.168.122.141:6789/0}, election epoch 194, quorum 0,1 ceph-node-1,ceph-node-2
>      mdsmap e94: 1/1/1 up {0=ceph-node-1=up:active}
>      osdmap e362: 3 osds: 2 up, 2 in
>       pgmap v5913: 840 pgs, 8 pools, 7441 MB data, 994 objects
>             25329 MB used, 12649 MB / 40059 MB avail
>                  840 active+clean
>
> 2015-03-26 17:23:56.009938 mon.0 [INF] pgmap v5913: 840 pgs: 840 active+clean; 7441 MB data, 25329 MB used, 12649 MB / 40059 MB avail
> 2015-03-26 17:25:51.042802 mon.0 [INF] pgmap v5914: 840 pgs: 840 active+clean; 7441 MB data, 25329 MB used, 12649 MB / 40059 MB avail; 0 B/s rd, 260 kB/s wr, 13 op/s
> 2015-03-26 17:25:56.046491 mon.0 [INF] pgmap v5915: 840 pgs: 840 active+clean; 7441 MB data, 25333 MB used, 12645 MB / 40059 MB avail; 0 B/s rd, 943 kB/s wr, 38 op/s
> 2015-03-26 17:26:01.058167 mon.0 [INF] pgmap v5916: 840 pgs: 840 active+clean; 7441 MB data, 25335 MB used, 12643 MB / 40059 MB avail; 0 B/s rd, 10699 kB/s wr, 621 op/s
>
> <this is where I kill the second OSD>
>
> 2015-03-26 17:26:26.778461 7f4ebeffd700  0 monclient: hunting for new mon
> 2015-03-26 17:26:30.701099 7f4ec45f5700  0 -- 192.168.122.111:0/1007741 >> 192.168.122.141:6789/0 pipe(0x7f4ec0023200 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f4ec0023490).fault
> 2015-03-26 17:26:42.701154 7f4ec44f4700  0 -- 192.168.122.111:0/1007741 >> 192.168.122.131:6789/0 pipe(0x7f4ec00251b0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f4ec0025440).fault
>
> And all writes block until I bring back an OSD.
>
> Lee
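A few follow-up notes. Those "monclient: hunting for new mon" and
".fault" lines are your client failing to reach a live monitor (it is
retrying 192.168.122.141 and .131, both down), which is why everything
blocks regardless of min_size. When quorum is lost you can still ask a
surviving monitor about its own state through its admin socket; a
rough sketch, assuming you run it on the node hosting that mon and the
socket is in the default location:

$ ceph daemon mon.ceph-node-1 mon_status

The "state" and "quorum" fields in the output show whether that mon is
probing, electing, or part of a quorum.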
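To confirm the per-pool change actually stuck, you can read the value
back ("rbd" below is just an example pool name):

$ ceph osd pool get rbd min_size
min_size: 1

And if you want newly-created pools to pick it up automatically, the
default belongs in ceph.conf:

[global]
osd pool default min size = 1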
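One aside on your loop: sed 's/[0-9]//g' will also eat digits inside
any pool name that happens to contain them. "rados lspools" prints one
bare pool name per line, so an untested but simpler sketch would be:

$ for f in $(rados lspools); do ceph osd pool set $f min_size 1; done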