Re: Upgrade from 0.47.2 to 0.48 - osd crashes

It seems this was caused by problems with the underlying filesystem
(btrfs here). I was able to solve it by rebooting. I'm not sure what the
exact problem was, but there were errors about it in dmesg.
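
For anyone who wants to check for the same thing, here is a minimal
Python sketch for pulling btrfs-related problem lines out of the kernel
log. It assumes `dmesg` is installed and readable by the current user.
The function names and the keyword list are illustrative guesses, not
taken from the errors actually seen here.

```python
import subprocess

# Keyword list is an assumption; adjust it to whatever your kernel logs.
KEYWORDS = ("error", "warn", "bug", "fail")


def btrfs_problem_lines(lines):
    """Return lines that mention btrfs together with a problem keyword."""
    hits = []
    for line in lines:
        lower = line.lower()
        if "btrfs" in lower and any(word in lower for word in KEYWORDS):
            hits.append(line.rstrip("\n"))
    return hits


def read_dmesg():
    """Run dmesg and return its output as a list of lines."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout.splitlines()
```

Calling `btrfs_problem_lines(read_dmesg())` on the OSD host returns the
matching lines, if there are any.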

On Tue, Jul 3, 2012 at 10:00 AM, John Axel Eriksson <john@xxxxxxxxx> wrote:
> So I first upgraded the mon, then went ahead and upgraded one of the
> osds which crashed and keeps crashing - probably when trying to
> upgrade the filestore.
> How should I proceed? FS is btrfs, one mon two osds. This is the conf
> for the osds:
>
> [osd]
>         osd data = /srv/osd.$id
>         osd journal = /srv/osd.$id.journal
>         osd journal size = 1000
>
> Here's the log output:
>
> root@ceph-osd-0:~# tail -F /var/log/ceph/ceph-osd.0.log
>
> --- end dump of recent events ---
> 2012-07-03 07:46:49.543030 7f2d0f332780  0 filestore(/srv/osd.0) mount
> FIEMAP ioctl is supported and appears to work
> 2012-07-03 07:46:49.543095 7f2d0f332780  0 filestore(/srv/osd.0) mount
> FIEMAP ioctl is disabled via 'filestore fiemap' config option
> 2012-07-03 07:46:50.049179 7f2d0f332780  0 filestore(/srv/osd.0) mount
> detected btrfs
> 2012-07-03 07:46:50.049435 7f2d0f332780  0 filestore(/srv/osd.0) mount
> btrfs CLONE_RANGE ioctl is supported
> 2012-07-03 07:46:50.898083 7f2d0f332780  0 filestore(/srv/osd.0) mount
> btrfs SNAP_CREATE is supported
> 2012-07-03 07:46:50.937621 7f2d0f332780  0 filestore(/srv/osd.0) mount
> btrfs SNAP_DESTROY is supported
> 2012-07-03 07:46:51.164967 7f2d0f332780  0 filestore(/srv/osd.0) mount
> btrfs START_SYNC is supported (transid 44619)
> 2012-07-03 07:46:51.426019 7f2d0f332780  0 filestore(/srv/osd.0) mount
> btrfs WAIT_SYNC is supported
> 2012-07-03 07:46:51.664258 7f2d0f332780  0 filestore(/srv/osd.0) mount
> btrfs SNAP_CREATE_V2 is supported
> 2012-07-03 07:46:52.535058 7f2d0f332780  0 filestore(/srv/osd.0) mount
> syncfs(2) syscall fully supported (by glibc and kernel)
> 2012-07-03 07:46:52.535529 7f2d0f332780 -1 filestore(/srv/osd.0)
> FileStore::mount : stale version stamp detected: 2. Proceeding,
> do_update is set, performing disk format upgrade.
> 2012-07-03 07:46:52.535683 7f2d0f332780  0 filestore(/srv/osd.0) mount
> found snaps <650707,650708>
> 2012-07-03 07:47:04.317867 7f2d0f332780  0 filestore(/srv/osd.0)
> mount: enabling PARALLEL journal mode: btrfs, SNAP_CREATE_V2 detected
> and 'filestore btrfs snap' mode is enabled
> 2012-07-03 07:47:04.354557 7f2d0f332780  1 journal _open
> /srv/osd.0.journal fd 23: 1048576000 bytes, block size 4096 bytes,
> directio = 1, aio = 0
> 2012-07-03 07:48:47.415664 7f2d0f332780  1 journal _open
> /srv/osd.0.journal fd 23: 1048576000 bytes, block size 4096 bytes,
> directio = 1, aio = 0
> 2012-07-03 07:48:47.416522 7f2d0f332780 -1 FileStore is old at version
> 2.  Updating...
> 2012-07-03 07:48:47.416535 7f2d0f332780 -1 Removing tmp pgs
> 2012-07-03 07:49:51.029035 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:49:56.029265 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:50:01.029413 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:50:06.029531 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:50:11.029651 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:50:16.029764 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:50:18.959270 7f2d08be4700  1 heartbeat_map reset_timeout
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:50:19.273378 7f2d0f332780 -1 Getting collections
> 2012-07-03 07:50:19.273399 7f2d0f332780 -1 834 to process.
> 2012-07-03 07:50:19.274588 7f2d0f332780 -1 0/833 processed
> 2012-07-03 07:50:19.274651 7f2d0f332780 -1 Updating collection meta
> current version is 2
> 2012-07-03 07:51:21.031150 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:51:26.031309 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:51:31.031430 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:51:36.031592 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:51:41.031732 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:51:46.031878 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:51:51.032004 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:51:56.032134 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:52:01.032255 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:52:06.032381 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:52:09.361844 7f2d08be4700  1 heartbeat_map reset_timeout
> 'FileStore::op_tp thread 0x7f2d08be4700' had timed out after 60
> 2012-07-03 07:53:11.033743 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:16.033859 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:21.034007 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:26.034140 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:31.034267 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:36.034398 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:41.034593 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:46.034735 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:51.034870 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:53:56.035015 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:01.035153 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:06.035289 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:11.035440 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:16.035581 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:21.035700 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:26.035839 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:31.035992 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:36.036139 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:41.036258 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:46.036394 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:51.036530 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:54:56.036651 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:55:01.036767 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:55:06.036911 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:55:11.037032 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had timed out after 60
> 2012-07-03 07:55:11.037077 7f2d0bbea700  1 heartbeat_map is_healthy
> 'FileStore::op_tp thread 0x7f2d083e3700' had suicide timed out after
> 180
> 2012-07-03 07:55:11.038755 7f2d0bbea700 -1 common/HeartbeatMap.cc: In
> function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*,
> const char*, time_t)' thread 7f2d0bbea700 time 2012-07-03
> 07:55:11.037140
> common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
>
>  ceph version 0.48argonaut (commit:c2b20ca74249892c8e5e40c12aa14446a2bf2030)
>  1: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char
> const*, long)+0x26a) [0x8285aa]
>  2: (ceph::HeartbeatMap::is_healthy()+0x87) [0x828d87]
>  3: (ceph::HeartbeatMap::check_touch_file()+0x23) [0x828fc3]
>  4: (CephContextServiceThread::entry()+0x54) [0x7aa734]
>  5: (()+0x7e9a) [0x7f2d0e7c4e9a]
>  6: (clone()+0x6d) [0x7f2d0d4644bd]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- begin dump of recent events ---
>    -66> 2012-07-03 07:46:41.019588 7f2d0f332780  0 ceph version
> 0.48argonaut (commit:c2b20ca74249892c8e5e40c12aa14446a2bf2030),
> process ceph-osd, pid 3070
>    -65> 2012-07-03 07:46:49.543030 7f2d0f332780  0
> filestore(/srv/osd.0) mount FIEMAP ioctl is supported and appears to
> work
>    -64> 2012-07-03 07:46:49.543095 7f2d0f332780  0
> filestore(/srv/osd.0) mount FIEMAP ioctl is disabled via 'filestore
> fiemap' config option
>    -63> 2012-07-03 07:46:50.049179 7f2d0f332780  0
> filestore(/srv/osd.0) mount detected btrfs
>    -62> 2012-07-03 07:46:50.049435 7f2d0f332780  0
> filestore(/srv/osd.0) mount btrfs CLONE_RANGE ioctl is supported
>    -61> 2012-07-03 07:46:50.898083 7f2d0f332780  0
> filestore(/srv/osd.0) mount btrfs SNAP_CREATE is supported
>    -60> 2012-07-03 07:46:50.937621 7f2d0f332780  0
> filestore(/srv/osd.0) mount btrfs SNAP_DESTROY is supported
>    -59> 2012-07-03 07:46:51.164967 7f2d0f332780  0
> filestore(/srv/osd.0) mount btrfs START_SYNC is supported (transid
> 44619)
>    -58> 2012-07-03 07:46:51.426019 7f2d0f332780  0
> filestore(/srv/osd.0) mount btrfs WAIT_SYNC is supported
>    -57> 2012-07-03 07:46:51.664258 7f2d0f332780  0
> filestore(/srv/osd.0) mount btrfs SNAP_CREATE_V2 is supported
>    -56> 2012-07-03 07:46:52.535058 7f2d0f332780  0
> filestore(/srv/osd.0) mount syncfs(2) syscall fully supported (by
> glibc and kernel)
>    -55> 2012-07-03 07:46:52.535529 7f2d0f332780 -1
> filestore(/srv/osd.0) FileStore::mount : stale version stamp detected:
> 2. Proceeding, do_update is set, performing disk format upgrade.
>    -54> 2012-07-03 07:46:52.535683 7f2d0f332780  0
> filestore(/srv/osd.0) mount found snaps <650707,650708>
>    -53> 2012-07-03 07:47:04.317867 7f2d0f332780  0
> filestore(/srv/osd.0) mount: enabling PARALLEL journal mode: btrfs,
> SNAP_CREATE_V2 detected and 'filestore btrfs snap' mode is enabled
>    -52> 2012-07-03 07:47:04.354557 7f2d0f332780  1 journal _open
> /srv/osd.0.journal fd 23: 1048576000 bytes, block size 4096 bytes,
> directio = 1, aio = 0
>    -51> 2012-07-03 07:48:47.415664 7f2d0f332780  1 journal _open
> /srv/osd.0.journal fd 23: 1048576000 bytes, block size 4096 bytes,
> directio = 1, aio = 0
>    -50> 2012-07-03 07:48:47.416522 7f2d0f332780 -1 FileStore is old at
> version 2.  Updating...
>    -49> 2012-07-03 07:48:47.416535 7f2d0f332780 -1 Removing tmp pgs
>    -48> 2012-07-03 07:49:51.029035 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -47> 2012-07-03 07:49:56.029265 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -46> 2012-07-03 07:50:01.029413 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -45> 2012-07-03 07:50:06.029531 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -44> 2012-07-03 07:50:11.029651 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -43> 2012-07-03 07:50:16.029764 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -42> 2012-07-03 07:50:18.959270 7f2d08be4700  1 heartbeat_map
> reset_timeout 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -41> 2012-07-03 07:50:19.273378 7f2d0f332780 -1 Getting collections
>    -40> 2012-07-03 07:50:19.273399 7f2d0f332780 -1 834 to process.
>    -39> 2012-07-03 07:50:19.274588 7f2d0f332780 -1 0/833 processed
>    -38> 2012-07-03 07:50:19.274651 7f2d0f332780 -1 Updating collection
> meta current version is 2
>    -37> 2012-07-03 07:51:21.031150 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -36> 2012-07-03 07:51:26.031309 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -35> 2012-07-03 07:51:31.031430 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -34> 2012-07-03 07:51:36.031592 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -33> 2012-07-03 07:51:41.031732 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -32> 2012-07-03 07:51:46.031878 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -31> 2012-07-03 07:51:51.032004 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -30> 2012-07-03 07:51:56.032134 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -29> 2012-07-03 07:52:01.032255 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -28> 2012-07-03 07:52:06.032381 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -27> 2012-07-03 07:52:09.361844 7f2d08be4700  1 heartbeat_map
> reset_timeout 'FileStore::op_tp thread 0x7f2d08be4700' had timed out
> after 60
>    -26> 2012-07-03 07:53:11.033743 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -25> 2012-07-03 07:53:16.033859 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -24> 2012-07-03 07:53:21.034007 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -23> 2012-07-03 07:53:26.034140 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -22> 2012-07-03 07:53:31.034267 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -21> 2012-07-03 07:53:36.034398 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -20> 2012-07-03 07:53:41.034593 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -19> 2012-07-03 07:53:46.034735 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -18> 2012-07-03 07:53:51.034870 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -17> 2012-07-03 07:53:56.035015 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -16> 2012-07-03 07:54:01.035153 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -15> 2012-07-03 07:54:06.035289 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -14> 2012-07-03 07:54:11.035440 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -13> 2012-07-03 07:54:16.035581 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -12> 2012-07-03 07:54:21.035700 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -11> 2012-07-03 07:54:26.035839 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>    -10> 2012-07-03 07:54:31.035992 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -9> 2012-07-03 07:54:36.036139 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -8> 2012-07-03 07:54:41.036258 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -7> 2012-07-03 07:54:46.036394 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -6> 2012-07-03 07:54:51.036530 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -5> 2012-07-03 07:54:56.036651 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -4> 2012-07-03 07:55:01.036767 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -3> 2012-07-03 07:55:06.036911 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -2> 2012-07-03 07:55:11.037032 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had timed out
> after 60
>     -1> 2012-07-03 07:55:11.037077 7f2d0bbea700  1 heartbeat_map
> is_healthy 'FileStore::op_tp thread 0x7f2d083e3700' had suicide timed
> out after 180
>      0> 2012-07-03 07:55:11.038755 7f2d0bbea700 -1
> common/HeartbeatMap.cc: In function 'bool
> ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*,
> time_t)' thread 7f2d0bbea700 time 2012-07-03 07:55:11.037140
> common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
>
>  ceph version 0.48argonaut (commit:c2b20ca74249892c8e5e40c12aa14446a2bf2030)
>  1: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char
> const*, long)+0x26a) [0x8285aa]
>  2: (ceph::HeartbeatMap::is_healthy()+0x87) [0x828d87]
>  3: (ceph::HeartbeatMap::check_touch_file()+0x23) [0x828fc3]
>  4: (CephContextServiceThread::entry()+0x54) [0x7aa734]
>  5: (()+0x7e9a) [0x7f2d0e7c4e9a]
>  6: (clone()+0x6d) [0x7f2d0d4644bd]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- end dump of recent events ---
> 2012-07-03 07:55:11.043564 7f2d0bbea700 -1 *** Caught signal (Aborted) **
>  in thread 7f2d0bbea700
>
>  ceph version 0.48argonaut (commit:c2b20ca74249892c8e5e40c12aa14446a2bf2030)
>  1: /usr/bin/ceph-osd() [0x6e900a]
>  2: (()+0xfcb0) [0x7f2d0e7cccb0]
>  3: (gsignal()+0x35) [0x7f2d0d3a8445]
>  4: (abort()+0x17b) [0x7f2d0d3abbab]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f2d0dcf669d]
>  6: (()+0xb5846) [0x7f2d0dcf4846]
>  7: (()+0xb5873) [0x7f2d0dcf4873]
>  8: (()+0xb596e) [0x7f2d0dcf496e]
>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x282) [0x79f662]
>  10: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char
> const*, long)+0x26a) [0x8285aa]
>  11: (ceph::HeartbeatMap::is_healthy()+0x87) [0x828d87]
>  12: (ceph::HeartbeatMap::check_touch_file()+0x23) [0x828fc3]
>  13: (CephContextServiceThread::entry()+0x54) [0x7aa734]
>  14: (()+0x7e9a) [0x7f2d0e7c4e9a]
>  15: (clone()+0x6d) [0x7f2d0d4644bd]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- begin dump of recent events ---
>      0> 2012-07-03 07:55:11.043564 7f2d0bbea700 -1 *** Caught signal
> (Aborted) **
>  in thread 7f2d0bbea700
>
>  ceph version 0.48argonaut (commit:c2b20ca74249892c8e5e40c12aa14446a2bf2030)
>  1: /usr/bin/ceph-osd() [0x6e900a]
>  2: (()+0xfcb0) [0x7f2d0e7cccb0]
>  3: (gsignal()+0x35) [0x7f2d0d3a8445]
>  4: (abort()+0x17b) [0x7f2d0d3abbab]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f2d0dcf669d]
>  6: (()+0xb5846) [0x7f2d0dcf4846]
>  7: (()+0xb5873) [0x7f2d0dcf4873]
>  8: (()+0xb596e) [0x7f2d0dcf496e]
>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x282) [0x79f662]
>  10: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char
> const*, long)+0x26a) [0x8285aa]
>  11: (ceph::HeartbeatMap::is_healthy()+0x87) [0x828d87]
>  12: (ceph::HeartbeatMap::check_touch_file()+0x23) [0x828fc3]
>  13: (CephContextServiceThread::entry()+0x54) [0x7aa734]
>  14: (()+0x7e9a) [0x7f2d0e7c4e9a]
>  15: (clone()+0x6d) [0x7f2d0d4644bd]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- end dump of recent events ---
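
In case it helps anyone reading a log like the one quoted above, here is
a rough Python sketch that summarizes the heartbeat_map lines per
thread: how many 60-second warnings there were, how many resets, and
whether the suicide timeout was hit. It assumes the real log file has
each entry on one unwrapped line (the mail client wrapped them here).
The regex is modelled only on the lines in this log, and the function
and field names are made up for illustration.

```python
import re
from datetime import datetime

# Pattern modelled on the heartbeat_map lines in the quoted log
# (assumption: one log entry per line, not wrapped as in the mail).
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+\s+-?\d+ "
    r"heartbeat_map (?P<kind>is_healthy|reset_timeout) "
    r"'(?P<name>[^']+)' had (?P<suicide>suicide )?timed out after (?P<secs>\d+)"
)


def summarize_heartbeats(lines):
    """Group heartbeat_map entries by the thread name they mention."""
    summary = {}
    for line in lines:
        m = HEARTBEAT_RE.match(line)
        if m is None:
            continue  # not a heartbeat_map line
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        entry = summary.setdefault(m.group("name"), {
            "warnings": 0,          # plain "had timed out after N" reports
            "resets": 0,            # reset_timeout entries
            "first": ts,            # timestamp of the first entry seen
            "last": ts,             # timestamp of the last entry seen
            "suicide_after": None,  # seconds, if the suicide timeout was hit
        })
        entry["last"] = ts
        if m.group("suicide"):
            entry["suicide_after"] = int(m.group("secs"))
        elif m.group("kind") == "reset_timeout":
            entry["resets"] += 1
        else:
            entry["warnings"] += 1
    return summary
```

To use it, open /var/log/ceph/ceph-osd.0.log and pass the file object
to `summarize_heartbeats`. Any thread whose `suicide_after` is not None
is the one that triggered the assert.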