Re: Ceph-osd Daemon Receives Segmentation Fault on Trusty After Upgrading to 0.94.10 Release

Hi Wido,
Over the weekend I rolled all servers back to version 0.94.9-1, and everything worked fine with the old release.

Today I upgraded all monitor servers and one OSD server to version 0.94.10-1. Each OSD server has two OSDs. I updated ceph.conf on that OSD server to remove the debug lines and restarted the OSD daemons.

This time OSD id 3 started and operated successfully, but OSD id 2 failed again with the same segmentation fault.
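
For anyone who wants to locate the crash in a large OSD log quickly, here is a minimal sketch. It assumes only that the crash report contains the words "segmentation fault" somewhere on a line; the function name find_segfaults and the context size are hypothetical choices for illustration, not part of Ceph.

```python
import re
from typing import Iterable, List, Tuple

# Case-insensitive match, since the exact wording of the crash banner is an assumption.
SEGFAULT = re.compile(r"segmentation fault", re.IGNORECASE)


def find_segfaults(lines: Iterable[str], context: int = 3) -> List[Tuple[int, List[str]]]:
    """Return (1-based line number, matching line plus `context` following lines)."""
    buffered = [line.rstrip("\n") for line in lines]
    hits = []
    for i, line in enumerate(buffered):
        if SEGFAULT.search(line):
            # Keep a few following lines, which usually hold the start of the backtrace.
            hits.append((i + 1, buffered[i:i + 1 + context]))
    return hits
```

It can be fed an open file object, for example the log of the failing daemon (the default location would be something like /var/log/ceph/ceph-osd.2.log, but that path is an assumption about this setup).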

I have uploaded the new logs to the same location as ceph.log.wido.20170321-2.tgz; the link is below again.

https://drive.google.com/drive/folders/0B_hD9LJqrkd7NmtJOW5YUnh6UE0?usp=sharing

Thanks for all your help.

Özhan


On Sun, Mar 19, 2017 at 8:47 PM, Wido den Hollander <wido@xxxxxxxx> wrote:

> On 17 March 2017 at 8:39, Özhan Rüzgar Karaman <oruzgarkaraman@xxxxxxxxx> wrote:
>
>
> Hi,
> Yesterday I started to upgrade my Ceph environment from 0.94.9 to 0.94.10.
> All monitor servers upgraded successfully, but I am experiencing problems
> starting the upgraded OSD daemons.
>
> When I try to start a Ceph OSD daemon (/usr/bin/ceph-osd), it receives a
> segmentation fault and dies after 2-3 minutes. To isolate the issue, I
> rolled the Ceph packages on that OSD server back to 0.94.9, and the
> problematic servers could rejoin the 0.94.10 cluster.
>
> My environment is a standard Ubuntu 14.04.5 (Trusty) server with a 4.4.x
> kernel, and I am using the standard packages from
> http://eu.ceph.com/debian-hammer; there is nothing special about my environment.
>
> I have uploaded the Ceph OSD Logs to the link below.
>
> https://drive.google.com/drive/folders/0B_hD9LJqrkd7NmtJOW5YUnh6UE0?usp=sharing
>
> And my ceph.conf is below
>
> [global]
> fsid = a3742d34-9b51-4a36-bf56-4defb62b2b8e
> mon_initial_members = mont1, mont2, mont3
> mon_host = 172.16.51.101,172.16.51.102,172.16.51.103
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
> public_network = 172.16.51.0/24
> cluster_network = 172.16.51.0/24
> debug_ms = 0/0
> debug_auth = 0/0
>
> [mon]
> mon_allow_pool_delete = false
> mon_osd_down_out_interval = 300
> osd_pool_default_flag_nodelete = true
>
> [osd]
> filestore_max_sync_interval = 15
> filestore_fiemap = true
> osd_max_backfills = 1
> osd_backfill_scan_min = 16
> osd_backfill_scan_max = 128
> osd_max_scrubs = 1
> osd_scrub_sleep = 1
> osd_scrub_chunk_min = 2
> osd_scrub_chunk_max = 16
> debug_osd = 0/0
> debug_filestore = 0/0
> debug_rbd = 0/0
> debug_rados = 0/0
> debug_journal = 0/0
> debug_journaler = 0/0

Can you try without all the debug_* lines and see what the log then yields?

It's crashing on something which isn't logged now.
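
One way to produce a configuration without the debug_* lines is sketched below. This is a minimal sketch, assuming each setting sits on its own line as "key = value"; the function name strip_debug_lines is hypothetical and not part of Ceph. It also treats "debug osd" and "debug-osd" as the same key as "debug_osd", since ceph.conf accepts those spellings.

```python
def strip_debug_lines(conf_text: str) -> str:
    """Return conf_text with every debug_* setting removed.

    Section headers, comments and all other settings are kept unchanged.
    """
    kept = []
    for line in conf_text.splitlines():
        # Normalise the key so "debug osd", "debug-osd" and "debug_osd" compare equal.
        key = line.split("=", 1)[0].strip().lower().replace(" ", "_").replace("-", "_")
        if "=" in line and key.startswith("debug_"):
            continue  # drop the debug setting so the daemon uses its default log level
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Run against the ceph.conf quoted above, this would remove debug_ms and debug_auth from [global] and the six debug_* entries from [osd], leaving everything else in place. Write the result to a copy first and compare it with the original before restarting any daemon.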

Wido

>
> Thanks for all your help.
>
> Özhan
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
