OSD crashed when there was no space left

Hello, everyone:

    A Ceph problem has been troubling me for quite a while now.

    I built a cluster with 3 hosts, each running three OSDs. After that
I used the command "rados bench 360 -p data -b 4194304 -t 300 write
--no-cleanup" to test the write performance of the cluster.
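
    For what it's worth, I think the filling-up could have been watched
while the benchmark ran with something like the commands below (assuming
the OSD data directories are all under /data/osd/, as the osd.0 log
further down suggests):

        watch -n 10 ceph df              # per-pool and total cluster usage
        df -h /data/osd/osd.*            # raw filesystem usage on each OSD host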

    When the cluster was nearly full, no more data could be written to
it. Unfortunately, one host then hung, and a lot of PGs started migrating
to the other OSDs. After a while many OSDs were marked down and out, and
the cluster could not work any more.
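
    As far as I understand it, the thresholds involved here are the mon
full/nearfull ratios and the automatic marking-out of down OSDs. The
values below are only the defaults I believe apply (I have not changed
anything), and I guess the rebalancing after the host hung could have
been held off with the noout flag:

        # ceph.conf, [global] section -- defaults, not verified on my cluster
        mon osd nearfull ratio = .85
        mon osd full ratio = .95
        mon osd down out interval = 300   # seconds before a down OSD is marked out

        # stop down OSDs from being marked out, i.e. no rebalancing
        ceph osd set noout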

    The following is the output of "ceph -s":

    cluster 002c3742-ab04-470f-8a7a-ad0658b547d6
    health HEALTH_ERR 103 pgs degraded; 993 pgs down; 617 pgs incomplete; 1008 pgs peering; 12 pgs recovering; 534 pgs stale; 1625 pgs stuck inactive; 534 pgs stuck stale; 1728 pgs stuck unclean; recovery 945/29649 objects degraded (3.187%); 1 full osd(s); 1 mons down, quorum 0,2 2,1
     monmap e1: 3 mons at {0=10.0.0.97:6789/0,1=10.0.0.98:6789/0,2=10.0.0.70:6789/0}, election epoch 40, quorum 0,2 2,1
     osdmap e173: 9 osds: 2 up, 2 in
            flags full
      pgmap v1779: 1728 pgs, 3 pools, 39528 MB data, 9883 objects
            37541 MB used, 3398 MB / 40940 MB avail
            945/29649 objects degraded (3.187%)
                  34 stale+active+degraded+remapped
                 176 stale+incomplete
                 320 stale+down+peering
                  53 active+degraded+remapped
                 408 incomplete
                   1 active+recovering+degraded
                 673 down+peering
                   1 stale+active+degraded
                  15 remapped+peering
                   3 stale+active+recovering+degraded+remapped
                   3 active+degraded
                  33 remapped+incomplete
                   8 active+recovering+degraded+remapped
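
    I suppose the individual stuck PGs can be examined with commands along
these lines ("<pgid>" stands for one of the down/incomplete PG ids from
the listing above):

        ceph health detail
        ceph pg dump_stuck inactive
        ceph pg dump_stuck stale
        ceph pg <pgid> query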

    The following is the output of "ceph osd tree":
    # id    weight  type name       up/down reweight
    -1      9       root default
    -3      9               rack unknownrack
    -2      3                       host 10.0.0.97
    0       1                               osd.0   down    0
    1       1                               osd.1   down    0
    2       1                               osd.2   down    0
    -4      3                       host 10.0.0.98
    3       1                               osd.3   down    0
    4       1                               osd.4   down    0
    5       1                               osd.5   down    0
    -5      3                       host 10.0.0.70
    6       1                               osd.6   up      1
    7       1                               osd.7   up      1
    8       1                               osd.8   down    0
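
    To confirm which OSD filesystems are really out of space, I assume
checking the mount points on the down hosts directly is the right thing,
e.g. on 10.0.0.97:

        df -h /data/osd/osd.0 /data/osd/osd.1 /data/osd/osd.2
        ceph osd dump | grep flags        # shows the cluster-wide "full" flag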

The following is part of the output of osd.0.log:

    -3> 2014-11-14 17:33:02.166022 7fd9dd1ab700  0 filestore(/data/osd/osd.0)  error (28) No space left on device not handled on operation 10 (15804.0.13, or op 13, counting from 0)
    -2> 2014-11-14 17:33:02.216768 7fd9dd1ab700  0 filestore(/data/osd/osd.0) ENOSPC handling not implemented
    -1> 2014-11-14 17:33:02.216783 7fd9dd1ab700  0 filestore(/data/osd/osd.0)  transaction dump:
    ...
    ...
    0> 2014-11-14 17:33:02.541008 7fd9dd1ab700 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7fd9dd1ab700 time 2014-11-14 17:33:02.251570
      os/FileStore.cc: 2540: FAILED assert(0 == "unexpected error")

      ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x17f8675]
     2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x4855) [0x1534c21]
     3: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x101) [0x152d67d]
     4: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x57b) [0x152bdc3]
     5: (FileStore::OpWQ::_process(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x2f) [0x1553c6f]
     6: (ThreadPool::WorkQueue<FileStore::OpSequencer>::_void_process(void*, ThreadPool::TPHandle&)+0x37) [0x15625e7]
     7: (ThreadPool::worker(ThreadPool::WorkThread*)+0x7a4) [0x18801de]
     8: (ThreadPool::WorkThread::entry()+0x23) [0x1881f2d]
     9: (Thread::_entry_func(void*)+0x23) [0x1998117]
    10: (()+0x79d1) [0x7fd9e92bf9d1]
    11: (clone()+0x6d) [0x7fd9e78ca9dd]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
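
    If a disassembly is needed to interpret the trace, I could produce one
with something like the line below (assuming the ceph-osd binary is in its
usual place at /usr/bin/ceph-osd):

        objdump -rdS /usr/bin/ceph-osd > ceph-osd.objdump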

    It seems the error code was ENOSPC (no space left on device). Why did
the OSD process exit with an assert in this situation? And if there was no
space left, why did the cluster choose to migrate data at all? Only osd.6
and osd.7 were still alive. I tried to restart the other OSDs, but after a
while those OSDs crashed again, and now I can't read the data any more.
    Is this a bug? Can anyone help me?
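
    Unless someone tells me this is a bad idea, my current plan is roughly:
free a little space so the daemons can start again, temporarily raise the
full threshold, and then delete the benchmark objects. Something like the
sketch below -- the set_full_ratio step is what I found in the documentation
and I have not tried it on 0.80.1 yet, and the "benchmark_data" prefix is my
assumption about how rados bench names its objects:

        # temporarily allow I/O on the full cluster (revert afterwards)
        ceph pg set_full_ratio 0.98

        # once the OSDs are back up, remove the objects written by rados bench
        rados -p data ls | grep '^benchmark_data' | xargs -n 1 rados -p data rm

        # put the threshold back to the default
        ceph pg set_full_ratio 0.95

    Does that sound sane, or is there a better way to get the data readable
again?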



