Crash with rados cppool and snapshots

Hi!

We are exploring options to regularly back up the contents of the
pools backing our rados gateways. To that end, we create nightly
snapshots of all relevant pools while there is no activity on the
system, so that the snapshots capture a consistent state.
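
A minimal sketch of what such a nightly snapshot job could look like,
assuming pool snapshots are taken with `rados mksnap`. The function
name snapshot_pools, the DRY_RUN switch, and the snapshot naming scheme
are hypothetical and only serve as an illustration; the pool names in
the usage comment are the test pools from the command further down.

```shell
#!/bin/sh
# Hypothetical helper: take one pool snapshot per pool.
# Usage: snapshot_pools SNAPNAME POOL...
# DRY_RUN (an assumption of this sketch) defaults to 1, so the commands
# are only printed unless DRY_RUN=0 is set explicitly.
snapshot_pools() {
    snap="$1"
    shift
    for pool in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show the command instead of running it.
            echo "rados -p $pool mksnap $snap"
        else
            # Create a pool snapshot named $snap in $pool.
            rados -p "$pool" mksnap "$snap"
        fi
    done
}

# Example invocation from a nightly cron job (not executed here):
#   DRY_RUN=0 snapshot_pools "nightly-$(date +%Y-%m-%d)" deleteme.lp deleteme.lp2
```

Running it with the default DRY_RUN=1 first shows which commands would
be issued before anything touches the cluster.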

To restore a whole pool to a specific snapshot state, we tried to use
the rados cppool command (see below) to copy a snapshot state into a
new pool. Unfortunately, this causes a segfault. Are we doing anything
wrong?

This command:

rados cppool --snap snap-1 deleteme.lp deleteme.lp2 2> segfault.txt

Produces this output:

*** Caught signal (Segmentation fault) **
 in thread 7f8f49a927c0
 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: rados() [0x43eedf]
 2: (()+0x10340) [0x7f8f48738340]
 3: (librados::IoCtxImpl::snap_lookup(char const*, unsigned long*)+0x17) [0x7f8f48aff127]
 4: (main()+0x1385) [0x411e75]
 5: (__libc_start_main()+0xf5) [0x7f8f4795fec5]
 6: rados() [0x41c6f7]
2014-10-29 12:03:22.761653 7f8f49a927c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f8f49a927c0

 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: rados() [0x43eedf]
 2: (()+0x10340) [0x7f8f48738340]
 3: (librados::IoCtxImpl::snap_lookup(char const*, unsigned long*)+0x17) [0x7f8f48aff127]
 4: (main()+0x1385) [0x411e75]
 5: (__libc_start_main()+0xf5) [0x7f8f4795fec5]
 6: rados() [0x41c6f7]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The full segfault file and the objdump output for the rados command can
be found here:

- https://public.centerdevice.de/53bddb80-423e-4213-ac62-59fe8dbb9bea
- https://public.centerdevice.de/50b81566-41fb-439a-b58b-e1e32d75f32a

We updated to the 0.80.7 release (we had already seen the issue with
0.80.5 and hoped that the long list of bugfixes in the release notes
would include a fix for it), but we are still seeing it. Rados
gateways, OSDs, MONs etc. have all been restarted after the update.
Package versions are as follows:

daniel.schneller@node01 [~] $  
➜  dpkg -l | grep ceph
ii  ceph                                0.80.7-1trusty 
ii  ceph-common                         0.80.7-1trusty 
ii  ceph-fs-common                      0.80.7-1trusty 
ii  ceph-fuse                           0.80.7-1trusty 
ii  ceph-mds                            0.80.7-1trusty 
ii  libcephfs1                          0.80.7-1trusty 
ii  python-ceph                         0.80.7-1trusty 

daniel.schneller@node01 [~] $  
➜  uname -a
Linux node01 3.13.0-27-generic #50-Ubuntu SMP Thu May 15 18:06:16 
   UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Copying the pool without the --snap option works. Should copying from
a snapshot work, at least in theory?

Thanks! 

Daniel

-- 
Daniel Schneller
Mobile Development Lead
 
CenterDevice GmbH                  | Merscheider Straße 1
                                   | 42699 Solingen
tel: +49 1754155711                | Deutschland
daniel.schneller@xxxxxxxxxxxxxxxx  | www.centerdevice.com




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com