Re: Missing clones

I'm not quite sure how to interpret this, but there are different objects referenced. From the first log output you pasted:

2018-02-19 11:00:23.183695 osd.29 [ERR] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head expected clone 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:64e 1 missing

From the failed PG import the logs mention two different objects:

Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.0000000000001d82:head#
snapset 0=[]:{}
Write #10:9de973fe:::rbd_data.966489238e1f29.000000000000250b:18#

And your last log output has another two different objects:

Write #10:9df3943b:::rbd_data.e57feb238e1f29.000000000003c2e1:head#
snapset 0=[]:{}
Write #10:9df399dd:::rbd_data.4401c7238e1f29.000000000000050d:19#


So in total we're seeing five different rbd_data object prefixes here:

 - rbd_data.2313975238e1f29
 - rbd_data.f5b8603d1b58ba
 - rbd_data.966489238e1f29
 - rbd_data.e57feb238e1f29
 - rbd_data.4401c7238e1f29

This doesn't make too much sense to me yet. Which ones belong to your corrupted VM? Do you have a backup of the VM in case the repair fails?
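
To collect such a list of prefixes without reading the logs by eye, something along these lines might help. It is only a sketch: the function name `rbd_prefixes` is made up for illustration, and it assumes the log text arrives on stdin with object names in the form shown above.

```shell
# Hypothetical helper: print the distinct "rbd_data.<id>" prefixes
# found in log text on stdin (repair logs or objectstore-tool output).
rbd_prefixes() {
  # -o prints only the matching part; the match stops at the dot
  # that separates the image id from the object number.
  grep -o 'rbd_data\.[0-9a-f]*' | sort -u
}

# Example with two lines taken from the failed import:
printf '%s\n' \
  'Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.0000000000001d82:head#' \
  'Write #10:9de973fe:::rbd_data.966489238e1f29.000000000000250b:18#' \
  | rbd_prefixes
```

Each distinct prefix should correspond to one RBD image, so the output gives the set of images touched by the logged objects.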


Quoting Karsten Becker <karsten.becker@xxxxxxxxxxx>:

Nope:

Write #10:9df3943b:::rbd_data.e57feb238e1f29.000000000003c2e1:head#
snapset 0=[]:{}
Write #10:9df399dd:::rbd_data.4401c7238e1f29.000000000000050d:19#
Write #10:9df399dd:::rbd_data.4401c7238e1f29.000000000000050d:23#
Write #10:9df399dd:::rbd_data.4401c7238e1f29.000000000000050d:head#
snapset 612=[23,22,15]:{19=[15],23=[23,22]}
/home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: In function 'void SnapMapper::add_oid(const hobject_t&, const std::set<snapid_t>&, MapCacher::Transaction<std::__cxx11::basic_string<char>, ceph::buffer::list>*)' thread 7fd45147a400 time 2018-02-20 13:56:20.672430
/home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: 246: FAILED assert(r == -2)
 ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fd4478c68f2]
 2: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> > const&, MapCacher::Transaction<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list>*)+0x8e9) [0x556930765fe9]
 3: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x5569304ca01b]
 4: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x5569304caae8]
 5: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ObjectStore::Sequencer&)+0x1135) [0x5569304d12f5]
 6: (main()+0x3909) [0x556930432349]
 7: (__libc_start_main()+0xf1) [0x7fd444d252b1]
 8: (_start()+0x2a) [0x5569304ba01a]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
*** Caught signal (Aborted) **
 in thread 7fd45147a400 thread_name:ceph-objectstor
ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
 1: (()+0x913f14) [0x556930ae1f14]
 2: (()+0x110c0) [0x7fd44619e0c0]
 3: (gsignal()+0xcf) [0x7fd444d37fcf]
 4: (abort()+0x16a) [0x7fd444d393fa]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x7fd4478c6a7e]
 6: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> > const&, MapCacher::Transaction<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list>*)+0x8e9) [0x556930765fe9]
 7: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x5569304ca01b]
 8: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x5569304caae8]
 9: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ObjectStore::Sequencer&)+0x1135) [0x5569304d12f5]
 10: (main()+0x3909) [0x556930432349]
 11: (__libc_start_main()+0xf1) [0x7fd444d252b1]
 12: (_start()+0x2a) [0x5569304ba01a]
Aborted



What I also do not understand: If I take your approach of finding out
what is stored in the PG, I get no match with my PG ID anymore.

If I take the approach of "rbd info" which was posted by Mykola Golub, I
get a match - unfortunately it is the most important VM on our system,
which holds the software for our finance department.

Best
Karsten
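
For reference, the "rbd info" approach can be sketched roughly as follows. This is an illustration, not necessarily the exact commands Mykola posted: the helper name `info_has_prefix` is hypothetical, and it relies on `rbd info` printing a `block_name_prefix:` line for the image.

```shell
# Hypothetical helper: read `rbd info` output on stdin and succeed
# (exit status 0) if its block_name_prefix equals the prefix in $1.
info_has_prefix() {
  awk -v want="$1" '
    $1 == "block_name_prefix:" && $2 == want { found = 1 }
    END { exit !found }
  '
}

# Possible use against a live cluster (pool name as used in this thread;
# not executed here because it needs the rbd CLI and cluster access):
#   for img in $(rbd -p cpVirtualMachines ls); do
#     rbd info "cpVirtualMachines/$img" \
#       | info_has_prefix rbd_data.2313975238e1f29 && echo "$img"
#   done
```

The prefix to search for is the `rbd_data.<id>` part of the object name reported by the failed repair.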









On 20.02.2018 09:16, Eugen Block wrote:
And does the re-import of the PG work? From the logs I assumed that the
snapshot(s) prevented a successful import, but now that they are deleted
it could work.


Quoting Karsten Becker <karsten.becker@xxxxxxxxxxx>:

Hi Eugen,

hmmm, that should be:

rbd -p cpVirtualMachines list | while read -r LINE; do
  osdmaptool --test-map-object "$LINE" --pool 10 osdmap 2>&1
  rbd snap ls "cpVirtualMachines/$LINE" | grep -v SNAPID | awk '{ print $2 }' | while read -r LINE2; do
    echo "$LINE"
    osdmaptool --test-map-object "$LINE2" --pool 10 osdmap 2>&1
  done
done | less

It's a Proxmox system. There were only two snapshots on the PG, which I
deleted now. Now nothing gets displayed on the PG... is that possible? A
repair still fails unfortunately...

Best & thank you for the hint!
Karsten



On 19.02.2018 22:42, Eugen Block wrote:
> BTW - how can I find out which RBDs are affected by this problem?
> Maybe a copy/remove of the affected RBDs could help? But how do I
> find out which RBDs this PG belongs to?

Depending on how many PGs your cluster/pool has, you could dump your
osdmap and then run the osdmaptool [1] for every rbd object in your pool
and grep for the affected PG. That would be quick for a few objects, I
guess:

ceph1:~ # ceph osd getmap -o /tmp/osdmap

ceph1:~ # osdmaptool --test-map-object image1 --pool 5 /tmp/osdmap
osdmaptool: osdmap file '/tmp/osdmap'
 object 'image1' -> 5.2 -> [0]

ceph1:~ # osdmaptool --test-map-object image2 --pool 5 /tmp/osdmap
osdmaptool: osdmap file '/tmp/osdmap'
 object 'image2' -> 5.f -> [0]
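
The "grep for the affected PG" step could look roughly like this. It is a sketch under assumptions: the helper name `objects_in_pg` is made up, and it expects osdmaptool output in exactly the format shown above.

```shell
# Hypothetical helper: read `osdmaptool --test-map-object` output on
# stdin and print the names of objects that map to the PG given in $1.
objects_in_pg() {
  # Appending "" forces a string comparison, so PG 5.2 is not
  # mistaken for PG 5.20 by awk's numeric comparison rules.
  awk -v pg="$1" '$1 == "object" && ($4 "") == (pg "") { print $2 }' \
    | tr -d "'"
}

# Example with the output shown above:
printf '%s\n' \
  "osdmaptool: osdmap file '/tmp/osdmap'" \
  " object 'image1' -> 5.2 -> [0]" \
  "osdmaptool: osdmap file '/tmp/osdmap'" \
  " object 'image2' -> 5.f -> [0]" \
  | objects_in_pg 5.2

# Possible use against a live cluster (slow on large pools, since
# osdmaptool is started once per object; not executed here):
#   rados -p cpVirtualMachines ls | while read -r obj; do
#     osdmaptool --test-map-object "$obj" --pool 10 /tmp/osdmap
#   done | objects_in_pg 10.7b9
```

Note that the names fed to osdmaptool have to be RADOS object names (for example the rbd_data.* objects), which is why the live example lists the pool with rados rather than rbd.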


[1]
https://www.hastexo.com/resources/hints-and-kinks/which-osd-stores-specific-rados-object/



Quoting Karsten Becker <karsten.becker@xxxxxxxxxxx>:

BTW - how can I find out which RBDs are affected by this problem?
Maybe a copy/remove of the affected RBDs could help? But how do I
find out which RBDs this PG belongs to?

Best
Karsten

On 19.02.2018 19:26, Karsten Becker wrote:
Hi.

Thank you for the tip. I just tried... but unfortunately the import
aborts:

Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.0000000000001d82:head#
snapset 0=[]:{}
Write #10:9de973fe:::rbd_data.966489238e1f29.000000000000250b:18#
Write #10:9de973fe:::rbd_data.966489238e1f29.000000000000250b:24#
Write #10:9de973fe:::rbd_data.966489238e1f29.000000000000250b:head#
snapset 628=[24,21,17]:{18=[17],24=[24,21]}
/home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: In function
'void SnapMapper::add_oid(const hobject_t&, const
std::set<snapid_t>&,
MapCacher::Transaction<std::__cxx11::basic_string<char>,
ceph::buffer::list>*)' thread 7facba7de400 time 2018-02-19
19:24:18.917515
/home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: 246: FAILED
assert(r == -2)
 ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf)
luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x7facb0c2a8f2]
 2: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t,
std::less<snapid_t>, std::allocator<snapid_t> > const&,
MapCacher::Transaction<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >,
ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
 3: (get_attrs(ObjectStore*, coll_t, ghobject_t,
ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&,
SnapMapper&)+0xafb) [0x55eef35f901b]
 4: (ObjectStoreTool::get_object(ObjectStore*, coll_t,
ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738)
[0x55eef35f9ae8]
 5: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >, ObjectStore::Sequencer&)+0x1135)
[0x55eef36002f5]
 6: (main()+0x3909) [0x55eef3561349]
 7: (__libc_start_main()+0xf1) [0x7facae0892b1]
 8: (_start()+0x2a) [0x55eef35e901a]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
*** Caught signal (Aborted) **
 in thread 7facba7de400 thread_name:ceph-objectstor
 ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf)
luminous (stable)
 1: (()+0x913f14) [0x55eef3c10f14]
 2: (()+0x110c0) [0x7facaf5020c0]
 3: (gsignal()+0xcf) [0x7facae09bfcf]
 4: (abort()+0x16a) [0x7facae09d3fa]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x28e) [0x7facb0c2aa7e]
 6: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t,
std::less<snapid_t>, std::allocator<snapid_t> > const&,
MapCacher::Transaction<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >,
ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
 7: (get_attrs(ObjectStore*, coll_t, ghobject_t,
ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&,
SnapMapper&)+0xafb) [0x55eef35f901b]
 8: (ObjectStoreTool::get_object(ObjectStore*, coll_t,
ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738)
[0x55eef35f9ae8]
 9: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >, ObjectStore::Sequencer&)+0x1135)
[0x55eef36002f5]
 10: (main()+0x3909) [0x55eef3561349]
 11: (__libc_start_main()+0xf1) [0x7facae0892b1]
 12: (_start()+0x2a) [0x55eef35e901a]
Aborted

Best
Karsten

On 19.02.2018 17:09, Eugen Block wrote:
Could [1] be of interest?
Exporting the intact PG and importing it back to the respective OSD
sounds promising.

[1]
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-July/019673.html
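
As a rough illustration of the export/import idea, the two helpers below only print the ceph-objectstore-tool commands instead of running them. This is a sketch and not a tested procedure: the function names are hypothetical, the data path is assumed to be the default /var/lib/ceph/osd/ceph-<id>, and the OSD ids in the example are made up. The OSD in question has to be stopped while the tool runs, and how to deal with the damaged copy already present on the target OSD is not covered here.

```shell
# Hypothetical helpers: they only PRINT commands, they do not run them.
# Assumption: OSD data lives under /var/lib/ceph/osd/ceph-<id>.

# pg_export_cmd <osd-id> <pgid> <file>: export one PG from a stopped OSD.
pg_export_cmd() {
  printf 'ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-%s --pgid %s --op export --file %s\n' "$1" "$2" "$3"
}

# pg_import_cmd <osd-id> <file>: import an exported PG into a stopped OSD.
pg_import_cmd() {
  printf 'ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-%s --op import --file %s\n' "$1" "$2"
}

# Example: OSD ids 12 and 29 are placeholders, not taken from a real layout.
pg_export_cmd 12 10.7b9 /tmp/pg-10.7b9.export
pg_import_cmd 29 /tmp/pg-10.7b9.export
```

Printing the commands first makes it easier to review them before touching a production OSD.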




Quoting Karsten Becker <karsten.becker@xxxxxxxxxxx>:

Hi.

We have size=3 min_size=2.

But this "upgrade" has been done during the weekend. We had size=2
min_size=1 before.

Best
Karsten



On 19.02.2018 13:02, Eugen Block wrote:
Hi,

just to rule out the obvious, which size does the pool have? You aren't
running it with size = 2, are you?


Quoting Karsten Becker <karsten.becker@xxxxxxxxxxx>:

Hi,

I have one damaged PG in my cluster. All OSDs are BlueStore. How do I
fix this?

2018-02-19 11:00:23.183695 osd.29 [ERR] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head expected clone 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:64e 1 missing
2018-02-19 11:00:23.183707 osd.29 [INF] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head 1 missing clone(s)
2018-02-19 11:01:18.074666 mon.0 [ERR] Health check update: Possible data damage: 1 pg inconsistent (PG_DAMAGED)
2018-02-19 11:01:11.856529 osd.29 [ERR] 10.7b9 repair 1 errors, 0 fixed
2018-02-19 11:01:24.333533 mon.0 [ERR] overall HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent

"ceph pg repair 10.7b9" fails and is not able to fix it. A manually
started scrub ("ceph pg scrub 10.7b9") does not help either.
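
In case it is useful to others, the PG, object name and missing clone id can be pulled out of such "expected clone ... missing" entries with a small filter. This is only a sketch: the function name `missing_clones` is made up, and it assumes each log entry sits on a single line in the format of the [ERR] repair message above.

```shell
# Hypothetical helper: read cluster log lines on stdin and print
# "<pgid> <object name> <clone id>" for every
# "expected clone ... missing" entry. Other lines are ignored.
missing_clones() {
  sed -n 's/.*repair \([0-9]*\.[0-9a-f]*\) .*expected clone [^ ]*:::\([^: ]*\):\([0-9a-f]*\) .*missing.*/\1 \2 \3/p'
}

# Example with the repair error from this PG:
printf '%s\n' \
  '2018-02-19 11:00:23.183695 osd.29 [ERR] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head expected clone 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:64e 1 missing' \
  | missing_clones
```

The clone id is printed exactly as it appears in the log; the scrub findings for the PG can also be inspected directly with "rados list-inconsistent-snapset 10.7b9 --format=json-pretty".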

Best from Berlin/Germany
Karsten


Ecologic Institut gemeinnuetzige GmbH
Pfalzburger Str. 43/44, D-10717 Berlin
Geschaeftsfuehrerin / Director: Dr. Camilla Bausch
Sitz der Gesellschaft / Registered Office: Berlin (Germany)
Registergericht / Court of Registration: Amtsgericht Berlin
(Charlottenburg), HRB 57947
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com








--
Eugen Block                             voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG      fax     : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg                         e-mail  : eblock@xxxxxx

        Vorsitzende des Aufsichtsrates: Angelika Mozdzen
          Sitz und Registergericht: Hamburg, HRB 90934
                  Vorstand: Jens-U. Mozdzen
                   USt-IdNr. DE 814 013 983
