Re: [ceph-users] data corruption with hammer

I'll miss the Ceph community as well. There were a few things I really
wanted to work on with Ceph.

I got this:

update_object_version oid 13 v 1166 (ObjNum 1028 snap 0 seq_num 1028)
dirty exists
1038:  left oid 13 (ObjNum 1028 snap 0 seq_num 1028)
1040:  finishing write tid 1 to nodez23350-256
1040:  finishing write tid 2 to nodez23350-256
1040:  finishing write tid 3 to nodez23350-256
1040:  finishing write tid 4 to nodez23350-256
1040:  finishing write tid 6 to nodez23350-256
1035: done (4 left)
1037: done (3 left)
1038: done (2 left)
1043: read oid 430 snap -1
1043:  expect (ObjNum 429 snap 0 seq_num 429)
1040:  finishing write tid 7 to nodez23350-256
update_object_version oid 256 v 661 (ObjNum 1029 snap 0 seq_num 1029)
dirty exists
1040:  left oid 256 (ObjNum 1029 snap 0 seq_num 1029)
1042:  expect (ObjNum 664 snap 0 seq_num 664)
1043: Error: oid 430 read returned error code -2
./test/osd/RadosModel.h: In function 'virtual void
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fa1bf7fe700 time
2016-03-17 10:47:19.085414
./test/osd/RadosModel.h: 1109: FAILED assert(0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x76) [0x4db956]
2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
3: (()+0x9791d) [0x7fa1d472191d]
4: (()+0x72519) [0x7fa1d46fc519]
5: (()+0x13c178) [0x7fa1d47c6178]
6: (()+0x80a4) [0x7fa1d425a0a4]
7: (clone()+0x6d) [0x7fa1d2bd504d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

I had to toggle writeback/forward and min_read_recency_for_promote a
few times to trigger it, but I don't know whether that is because I
only had one job running. Even with six jobs running it is not easy to
trigger with ceph_test_rados, whereas in the RBD VMs it shows up almost
instantly.
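
For anyone who wants to repeat the toggling while ceph_test_rados is
running, here is a rough Python sketch of the sequence. It only builds
the command lines and prints them (a dry run). The pool name
"cache_pool", the number of cycles, and the recency values 0 and 1 are
placeholders chosen for illustration, not the values used in the runs
above; depending on the release, the CLI may also ask for extra
confirmation when switching to forward mode.

```python
import itertools


def toggle_commands(cache_pool, cycles, recency_values=(0, 1)):
    """Build `cycles` rounds of cache-mode / recency toggles as argv lists.

    cache_pool and recency_values are illustrative assumptions; adjust
    them to match the cache tier being tested.
    """
    modes = itertools.cycle(["forward", "writeback"])
    recencies = itertools.cycle(recency_values)
    commands = []
    for _ in range(cycles):
        # Flip the cache tier between forward and writeback.
        commands.append(
            ["ceph", "osd", "tier", "cache-mode", cache_pool, next(modes)]
        )
        # Flip the promotion recency threshold on the same pool.
        commands.append(
            ["ceph", "osd", "pool", "set", cache_pool,
             "min_read_recency_for_promote", str(next(recencies))]
        )
    return commands


if __name__ == "__main__":
    for cmd in toggle_commands("cache_pool", cycles=4):
        # Dry run: print only. To apply, hand cmd to
        # subprocess.run(cmd, check=True) with a pause between steps.
        print(" ".join(cmd))
```

Keeping the commands as lists makes it easy to review them before
anything touches the cluster.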

Here are the crashes from the six runs (I have roughly the last 2000
lines of each if needed):

nodev:
update_object_version oid 1015 v 1255 (ObjNum 1014 snap 0 seq_num
1014) dirty exists
1015:  left oid 1015 (ObjNum 1014 snap 0 seq_num 1014)
1016:  finishing write tid 1 to nodev21799-1016
1016:  finishing write tid 2 to nodev21799-1016
1016:  finishing write tid 3 to nodev21799-1016
1016:  finishing write tid 4 to nodev21799-1016
1016:  finishing write tid 6 to nodev21799-1016
1016:  finishing write tid 7 to nodev21799-1016
update_object_version oid 1016 v 1957 (ObjNum 1015 snap 0 seq_num
1015) dirty exists
1016:  left oid 1016 (ObjNum 1015 snap 0 seq_num 1015)
1017:  finishing write tid 1 to nodev21799-1017
1017:  finishing write tid 2 to nodev21799-1017
1017:  finishing write tid 3 to nodev21799-1017
1017:  finishing write tid 5 to nodev21799-1017
1017:  finishing write tid 6 to nodev21799-1017
update_object_version oid 1017 v 1010 (ObjNum 1016 snap 0 seq_num
1016) dirty exists
1017:  left oid 1017 (ObjNum 1016 snap 0 seq_num 1016)
1018:  finishing write tid 1 to nodev21799-1018
1018:  finishing write tid 2 to nodev21799-1018
1018:  finishing write tid 3 to nodev21799-1018
1018:  finishing write tid 4 to nodev21799-1018
1018:  finishing write tid 6 to nodev21799-1018
1018:  finishing write tid 7 to nodev21799-1018
update_object_version oid 1018 v 1093 (ObjNum 1017 snap 0 seq_num
1017) dirty exists
1018:  left oid 1018 (ObjNum 1017 snap 0 seq_num 1017)
1019:  finishing write tid 1 to nodev21799-1019
1019:  finishing write tid 2 to nodev21799-1019
1019:  finishing write tid 3 to nodev21799-1019
1019:  finishing write tid 5 to nodev21799-1019
1019:  finishing write tid 6 to nodev21799-1019
update_object_version oid 1019 v 462 (ObjNum 1018 snap 0 seq_num 1018)
dirty exists
1019:  left oid 1019 (ObjNum 1018 snap 0 seq_num 1018)
1021:  finishing write tid 1 to nodev21799-1021
1020:  finishing write tid 1 to nodev21799-1020
1020:  finishing write tid 2 to nodev21799-1020
1020:  finishing write tid 3 to nodev21799-1020
1020:  finishing write tid 5 to nodev21799-1020
1020:  finishing write tid 6 to nodev21799-1020
update_object_version oid 1020 v 1287 (ObjNum 1019 snap 0 seq_num
1019) dirty exists
1020:  left oid 1020 (ObjNum 1019 snap 0 seq_num 1019)
1021:  finishing write tid 2 to nodev21799-1021
1021:  finishing write tid 3 to nodev21799-1021
1021:  finishing write tid 5 to nodev21799-1021
1021:  finishing write tid 6 to nodev21799-1021
update_object_version oid 1021 v 1077 (ObjNum 1020 snap 0 seq_num
1020) dirty exists
1021:  left oid 1021 (ObjNum 1020 snap 0 seq_num 1020)
1022:  finishing write tid 1 to nodev21799-1022
1022:  finishing write tid 2 to nodev21799-1022
1022:  finishing write tid 3 to nodev21799-1022
1022:  finishing write tid 5 to nodev21799-1022
1022:  finishing write tid 6 to nodev21799-1022
update_object_version oid 1022 v 1213 (ObjNum 1021 snap 0 seq_num
1021) dirty exists
1022:  left oid 1022 (ObjNum 1021 snap 0 seq_num 1021)
1023:  finishing write tid 1 to nodev21799-1023
1023:  finishing write tid 2 to nodev21799-1023
1023:  finishing write tid 3 to nodev21799-1023
1023:  finishing write tid 5 to nodev21799-1023
1023:  finishing write tid 6 to nodev21799-1023
update_object_version oid 1023 v 2612 (ObjNum 1022 snap 0 seq_num
1022) dirty exists
1023:  left oid 1023 (ObjNum 1022 snap 0 seq_num 1022)
1024:  finishing write tid 1 to nodev21799-1024
1025: Error: oid 219 read returned error code -2
./test/osd/RadosModel.h: In function 'virtual void
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f0df8a16700 time
2016-03-17 10:53:43.493575
./test/osd/RadosModel.h: 1109: FAILED assert(0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x76) [0x4db956]
2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
3: (()+0x9791d) [0x7f0e015dd91d]
4: (()+0x72519) [0x7f0e015b8519]
5: (()+0x13c178) [0x7f0e01682178]
6: (()+0x80a4) [0x7f0e011160a4]
7: (clone()+0x6d) [0x7f0dffa9104d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

nodew:
1117:  expect (ObjNum 8 snap 0 seq_num 8)
1120:  expect (ObjNum 95 snap 0 seq_num 95)
1121:  expect (ObjNum 994 snap 0 seq_num 994)
1113:  expect (ObjNum 362 snap 0 seq_num 362)
1118:  expect (ObjNum 179 snap 0 seq_num 179)
1115:  expect (ObjNum 943 snap 0 seq_num 943)
1119:  expect (ObjNum 250 snap 0 seq_num 250)
1124:  finishing write tid 1 to nodew21820-361
1124:  finishing write tid 2 to nodew21820-361
1124:  finishing write tid 3 to nodew21820-361
1124:  finishing write tid 4 to nodew21820-361
1124:  finishing write tid 6 to nodew21820-361
1124:  finishing write tid 7 to nodew21820-361
update_object_version oid 361 v 892 (ObjNum 1061 snap 0 seq_num 1061)
dirty exists
1124:  left oid 361 (ObjNum 1061 snap 0 seq_num 1061)
1125:  finishing write tid 1 to nodew21820-486
1125:  finishing write tid 2 to nodew21820-486
1125:  finishing write tid 3 to nodew21820-486
1125:  finishing write tid 5 to nodew21820-486
1125:  finishing write tid 6 to nodew21820-486
update_object_version oid 486 v 1317 (ObjNum 1062 snap 0 seq_num 1062)
dirty exists
1125:  left oid 486 (ObjNum 1062 snap 0 seq_num 1062)
1126:  expect (ObjNum 289 snap 0 seq_num 289)
1127:  finishing write tid 1 to nodew21820-765
1127:  finishing write tid 2 to nodew21820-765
1127:  finishing write tid 3 to nodew21820-765
1127:  finishing write tid 5 to nodew21820-765
1127:  finishing write tid 6 to nodew21820-765
update_object_version oid 765 v 1156 (ObjNum 1063 snap 0 seq_num 1063)
dirty exists
1127:  left oid 765 (ObjNum 1063 snap 0 seq_num 1063)
1128:  finishing write tid 1 to nodew21820-40
1128:  finishing write tid 2 to nodew21820-40
1128:  finishing write tid 3 to nodew21820-40
1128:  finishing write tid 5 to nodew21820-40
1128:  finishing write tid 6 to nodew21820-40
update_object_version oid 40 v 876 (ObjNum 1064 snap 0 seq_num 1064)
dirty exists
1128:  left oid 40 (ObjNum 1064 snap 0 seq_num 1064)
1129:  expect (ObjNum 616 snap 0 seq_num 616)
1110: done (14 left)
1113: done (13 left)
1115: done (12 left)
1117: done (11 left)
1118: done (10 left)
1119: done (9 left)
1120: done (8 left)
1121: done (7 left)
1124: done (6 left)
1125: done (5 left)
1126: done (4 left)
1127: done (3 left)
1128: done (2 left)
1129: done (1 left)
1131: read oid 29 snap -1
1131:  expect (ObjNum 28 snap 0 seq_num 28)
1132: read oid 764 snap -1
1132:  expect (ObjNum 763 snap 0 seq_num 763)
1133: read oid 469 snap -1
1133:  expect (ObjNum 468 snap 0 seq_num 468)
1134: write oid 243 current snap is 0
1134:  seq_num 1065 ranges
{483354=596553,1514502=531232,2509844=632287,3283353=1}
1134:  writing nodew21820-243 from 483354 to 1079907 tid 1
1134:  writing nodew21820-243 from 1514502 to 2045734 tid 2
1134:  writing nodew21820-243 from 2509844 to 3142131 tid 3
1134:  writing nodew21820-243 from 3283353 to 3283354 tid 4
1135: read oid 569 snap -1
1135:  expect (ObjNum 568 snap 0 seq_num 568)
1133: Error: oid 469 read returned error code -2
./test/osd/RadosModel.h: In function 'virtual void
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fae71d03700 time
2016-03-17 11:00:02.124951
./test/osd/RadosModel.h: 1109: FAILED assert(0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x76) [0x4db956]
2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
3: (()+0x9791d) [0x7fae7a8ca91d]
4: (()+0x72519) [0x7fae7a8a5519]
5: (()+0x13c178) [0x7fae7a96f178]
6: (()+0x80a4) [0x7fae7a4030a4]
7: (clone()+0x6d) [0x7fae78d7e04d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

nodex:

1024:  finishing write tid 1 to nodex22014-1024
1025:  expect (ObjNum 75 snap 0 seq_num 75)
1024:  finishing write tid 2 to nodex22014-1024
1024:  finishing write tid 3 to nodex22014-1024
1024:  finishing write tid 5 to nodex22014-1024
1024:  finishing write tid 6 to nodex22014-1024
update_object_version oid 1024 v 753 (ObjNum 1023 snap 0 seq_num 1023)
dirty exists
1024:  left oid 1024 (ObjNum 1023 snap 0 seq_num 1023)
982: done (44 left)
983: done (43 left)
984: done (42 left)
985: done (41 left)
986: done (40 left)
987: done (39 left)
988: done (38 left)
989: done (37 left)
990: done (36 left)
991: done (35 left)
992: done (34 left)
993: done (33 left)
994: done (32 left)
995: done (31 left)
996: done (30 left)
997: done (29 left)
998: done (28 left)
999: done (27 left)
1000: done (26 left)
1001: done (25 left)
1002: done (24 left)
1003: done (23 left)
1004: done (22 left)
1005: done (21 left)
1006: done (20 left)
1007: done (19 left)
1008: done (18 left)
1009: done (17 left)
1010: done (16 left)
1011: done (15 left)
1012: done (14 left)
1013: done (13 left)
1014: done (12 left)
1015: done (11 left)
1016: done (10 left)
1017: done (9 left)
1018: done (8 left)
1019: done (7 left)
1020: done (6 left)
1021: done (5 left)
1022: done (4 left)
1023: done (3 left)
1024: done (2 left)
1025: done (1 left)
1026: done (0 left)
1027: delete oid 101 current snap is 0
1027: done (0 left)
1028: read oid 156 snap -1
1028:  expect (ObjNum 155 snap 0 seq_num 155)
1029: read oid 691 snap -1
1029:  expect (ObjNum 690 snap 0 seq_num 690)
1030: read oid 282 snap -1
1030:  expect (ObjNum 281 snap 0 seq_num 281)
1031: read oid 979 snap -1
1031:  expect (ObjNum 978 snap 0 seq_num 978)
1032: read oid 203 snap -1
1032:  expect (ObjNum 202 snap 0 seq_num 202)
1033: setattr oid 464 current snap is 0
1032: Error: oid 203 read returned error code -2
./test/osd/RadosModel.h: In function 'virtual void
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fafee64a700 time
2016-03-17 10:53:44.291343
./test/osd/RadosModel.h: 1109: FAILED assert(0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x76) [0x4db956]
2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
3: (()+0x9791d) [0x7faff721191d]
4: (()+0x72519) [0x7faff71ec519]
5: (()+0x13c178) [0x7faff72b6178]
6: (()+0x80a4) [0x7faff6d4a0a4]
7: (clone()+0x6d) [0x7faff56c504d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

nodey:

974: done (52 left)
975: done (51 left)
976: done (50 left)
977: done (49 left)
978: done (48 left)
979: done (47 left)
980: done (46 left)
981: done (45 left)
982: done (44 left)
983: done (43 left)
984: done (42 left)
985: done (41 left)
986: done (40 left)
987: done (39 left)
988: done (38 left)
989: done (37 left)
990: done (36 left)
991: done (35 left)
992: done (34 left)
993: done (33 left)
994: done (32 left)
995: done (31 left)
996: done (30 left)
997: done (29 left)
998: done (28 left)
999: done (27 left)
1000: done (26 left)
1001: done (25 left)
1002: done (24 left)
1003: done (23 left)
1004: done (22 left)
1005: done (21 left)
1006: done (20 left)
1007: done (19 left)
1008: done (18 left)
1009: done (17 left)
1010: done (16 left)
1011: done (15 left)
1012: done (14 left)
1013: done (13 left)
1014: done (12 left)
1015: done (11 left)
1016: done (10 left)
1017: done (9 left)
1018: done (8 left)
1019: done (7 left)
1020: done (6 left)
1021: done (5 left)
1022: done (4 left)
1023: done (3 left)
1024: done (2 left)
1025: done (1 left)
1026: done (0 left)
1027: delete oid 101 current snap is 0
1027: done (0 left)
1028: read oid 156 snap -1
1028:  expect (ObjNum 155 snap 0 seq_num 155)
1029: read oid 691 snap -1
1029:  expect (ObjNum 690 snap 0 seq_num 690)
1030: read oid 282 snap -1
1030:  expect (ObjNum 281 snap 0 seq_num 281)
1031: read oid 979 snap -1
1031:  expect (ObjNum 978 snap 0 seq_num 978)
1032: read oid 203 snap -1
1032:  expect (ObjNum 202 snap 0 seq_num 202)
1033: setattr oid 464 current snap is 0
1028: Error: oid 156 read returned error code -2
./test/osd/RadosModel.h: In function 'virtual void
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f55fa3c6700 time
2016-03-17 10:53:57.082571
./test/osd/RadosModel.h: 1109: FAILED assert(0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x76) [0x4db956]
2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
3: (()+0x9791d) [0x7f5602f8d91d]
4: (()+0x72519) [0x7f5602f68519]
5: (()+0x13c178) [0x7f5603032178]
6: (()+0x80a4) [0x7f5602ac60a4]
7: (clone()+0x6d) [0x7f560144104d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

nodez:

1014: done (11 left)
1026: delete oid 717 current snap is 0
1015:  finishing write tid 2 to nodez24249-1015
1015:  finishing write tid 4 to nodez24249-1015
1015:  finishing write tid 5 to nodez24249-1015
update_object_version oid 1015 v 3003 (ObjNum 1014 snap 0 seq_num
1014) dirty exists
1015:  left oid 1015 (ObjNum 1014 snap 0 seq_num 1014)
1016:  finishing write tid 1 to nodez24249-1016
1016:  finishing write tid 2 to nodez24249-1016
1016:  finishing write tid 3 to nodez24249-1016
1016:  finishing write tid 4 to nodez24249-1016
1016:  finishing write tid 6 to nodez24249-1016
1016:  finishing write tid 7 to nodez24249-1016
update_object_version oid 1016 v 1201 (ObjNum 1015 snap 0 seq_num
1015) dirty exists
1016:  left oid 1016 (ObjNum 1015 snap 0 seq_num 1015)
1017:  finishing write tid 1 to nodez24249-1017
1017:  finishing write tid 2 to nodez24249-1017
1017:  finishing write tid 3 to nodez24249-1017
1017:  finishing write tid 5 to nodez24249-1017
1017:  finishing write tid 6 to nodez24249-1017
update_object_version oid 1017 v 3007 (ObjNum 1016 snap 0 seq_num
1016) dirty exists
1017:  left oid 1017 (ObjNum 1016 snap 0 seq_num 1016)
1018:  finishing write tid 1 to nodez24249-1018
1018:  finishing write tid 2 to nodez24249-1018
1018:  finishing write tid 3 to nodez24249-1018
1018:  finishing write tid 4 to nodez24249-1018
1018:  finishing write tid 6 to nodez24249-1018
1018:  finishing write tid 7 to nodez24249-1018
update_object_version oid 1018 v 1283 (ObjNum 1017 snap 0 seq_num
1017) dirty exists
1018:  left oid 1018 (ObjNum 1017 snap 0 seq_num 1017)
1019:  finishing write tid 1 to nodez24249-1019
1019:  finishing write tid 2 to nodez24249-1019
1019:  finishing write tid 3 to nodez24249-1019
1019:  finishing write tid 5 to nodez24249-1019
1019:  finishing write tid 6 to nodez24249-1019
update_object_version oid 1019 v 999 (ObjNum 1018 snap 0 seq_num 1018)
dirty exists
1019:  left oid 1019 (ObjNum 1018 snap 0 seq_num 1018)
1020:  finishing write tid 1 to nodez24249-1020
1020:  finishing write tid 2 to nodez24249-1020
1020:  finishing write tid 3 to nodez24249-1020
1020:  finishing write tid 5 to nodez24249-1020
1020:  finishing write tid 6 to nodez24249-1020
update_object_version oid 1020 v 813 (ObjNum 1019 snap 0 seq_num 1019)
dirty exists
1020:  left oid 1020 (ObjNum 1019 snap 0 seq_num 1019)
1021:  finishing write tid 1 to nodez24249-1021
1021:  finishing write tid 2 to nodez24249-1021
1021:  finishing write tid 3 to nodez24249-1021
1021:  finishing write tid 5 to nodez24249-1021
1021:  finishing write tid 6 to nodez24249-1021
update_object_version oid 1021 v 1038 (ObjNum 1020 snap 0 seq_num
1020) dirty exists
1021:  left oid 1021 (ObjNum 1020 snap 0 seq_num 1020)
1022:  finishing write tid 1 to nodez24249-1022
1022:  finishing write tid 2 to nodez24249-1022
1022:  finishing write tid 3 to nodez24249-1022
1022:  finishing write tid 5 to nodez24249-1022
1022:  finishing write tid 6 to nodez24249-1022
update_object_version oid 1022 v 781 (ObjNum 1021 snap 0 seq_num 1021)
dirty exists
1022:  left oid 1022 (ObjNum 1021 snap 0 seq_num 1021)
1023:  finishing write tid 1 to nodez24249-1023
1023:  finishing write tid 2 to nodez24249-1023
1023:  finishing write tid 3 to nodez24249-1023
1023:  finishing write tid 5 to nodez24249-1023
1023:  finishing write tid 6 to nodez24249-1023
update_object_version oid 1023 v 1537 (ObjNum 1022 snap 0 seq_num
1022) dirty exists
1023:  left oid 1023 (ObjNum 1022 snap 0 seq_num 1022)
1024:  finishing write tid 1 to nodez24249-1024
1025: Error: oid 230 read returned error code -2
./test/osd/RadosModel.h: In function 'virtual void
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fd9bb7fe700 time
2016-03-17 10:53:41.757921
./test/osd/RadosModel.h: 1109: FAILED assert(0)
 ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x76) [0x4db956]
 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
 3: (()+0x9791d) [0x7fd9d088d91d]
 4: (()+0x72519) [0x7fd9d0868519]
 5: (()+0x13c178) [0x7fd9d0932178]
 6: (()+0x80a4) [0x7fd9d03c60a4]
 7: (clone()+0x6d) [0x7fd9ced4104d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

nodezz:

1015:  finishing write tid 1 to nodezz25161-1015
1015:  finishing write tid 2 to nodezz25161-1015
1015:  finishing write tid 4 to nodezz25161-1015
1015:  finishing write tid 5 to nodezz25161-1015
update_object_version oid 1015 v 900 (ObjNum 1014 snap 0 seq_num 1014)
dirty exists
1015:  left oid 1015 (ObjNum 1014 snap 0 seq_num 1014)
1016:  finishing write tid 1 to nodezz25161-1016
1016:  finishing write tid 2 to nodezz25161-1016
1016:  finishing write tid 3 to nodezz25161-1016
1016:  finishing write tid 4 to nodezz25161-1016
1016:  finishing write tid 6 to nodezz25161-1016
1016:  finishing write tid 7 to nodezz25161-1016
update_object_version oid 1016 v 1021 (ObjNum 1015 snap 0 seq_num
1015) dirty exists
1016:  left oid 1016 (ObjNum 1015 snap 0 seq_num 1015)
1017:  finishing write tid 1 to nodezz25161-1017
1017:  finishing write tid 2 to nodezz25161-1017
1017:  finishing write tid 3 to nodezz25161-1017
1017:  finishing write tid 5 to nodezz25161-1017
1017:  finishing write tid 6 to nodezz25161-1017
update_object_version oid 1017 v 3011 (ObjNum 1016 snap 0 seq_num
1016) dirty exists
1017:  left oid 1017 (ObjNum 1016 snap 0 seq_num 1016)
1018:  finishing write tid 1 to nodezz25161-1018
1018:  finishing write tid 2 to nodezz25161-1018
1018:  finishing write tid 3 to nodezz25161-1018
1018:  finishing write tid 4 to nodezz25161-1018
1018:  finishing write tid 6 to nodezz25161-1018
1018:  finishing write tid 7 to nodezz25161-1018
update_object_version oid 1018 v 1099 (ObjNum 1017 snap 0 seq_num
1017) dirty exists
1018:  left oid 1018 (ObjNum 1017 snap 0 seq_num 1017)
1019:  finishing write tid 1 to nodezz25161-1019
1019:  finishing write tid 2 to nodezz25161-1019
1019:  finishing write tid 3 to nodezz25161-1019
1019:  finishing write tid 5 to nodezz25161-1019
1019:  finishing write tid 6 to nodezz25161-1019
update_object_version oid 1019 v 1300 (ObjNum 1018 snap 0 seq_num
1018) dirty exists
1019:  left oid 1019 (ObjNum 1018 snap 0 seq_num 1018)
1020:  finishing write tid 1 to nodezz25161-1020
1020:  finishing write tid 2 to nodezz25161-1020
1020:  finishing write tid 3 to nodezz25161-1020
1020:  finishing write tid 5 to nodezz25161-1020
1020:  finishing write tid 6 to nodezz25161-1020
update_object_version oid 1020 v 1324 (ObjNum 1019 snap 0 seq_num
1019) dirty exists
1020:  left oid 1020 (ObjNum 1019 snap 0 seq_num 1019)
1021:  finishing write tid 1 to nodezz25161-1021
1021:  finishing write tid 2 to nodezz25161-1021
1021:  finishing write tid 3 to nodezz25161-1021
1021:  finishing write tid 5 to nodezz25161-1021
1021:  finishing write tid 6 to nodezz25161-1021
update_object_version oid 1021 v 890 (ObjNum 1020 snap 0 seq_num 1020)
dirty exists
1021:  left oid 1021 (ObjNum 1020 snap 0 seq_num 1020)
1022:  finishing write tid 1 to nodezz25161-1022
1022:  finishing write tid 2 to nodezz25161-1022
1022:  finishing write tid 3 to nodezz25161-1022
1022:  finishing write tid 5 to nodezz25161-1022
1022:  finishing write tid 6 to nodezz25161-1022
update_object_version oid 1022 v 464 (ObjNum 1021 snap 0 seq_num 1021)
dirty exists
1022:  left oid 1022 (ObjNum 1021 snap 0 seq_num 1021)
1023:  finishing write tid 1 to nodezz25161-1023
1023:  finishing write tid 2 to nodezz25161-1023
1023:  finishing write tid 3 to nodezz25161-1023
1023:  finishing write tid 5 to nodezz25161-1023
1023:  finishing write tid 6 to nodezz25161-1023
update_object_version oid 1023 v 1516 (ObjNum 1022 snap 0 seq_num
1022) dirty exists
1023:  left oid 1023 (ObjNum 1022 snap 0 seq_num 1022)
1024:  finishing write tid 1 to nodezz25161-1024
1024:  finishing write tid 2 to nodezz25161-1024
1025: Error: oid 219 read returned error code -2
./test/osd/RadosModel.h: In function 'virtual void
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fbb1bfff700 time
2016-03-17 10:53:53.071338
./test/osd/RadosModel.h: 1109: FAILED assert(0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x76) [0x4db956]
2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
3: (()+0x9791d) [0x7fbb30ff191d]
4: (()+0x72519) [0x7fbb30fcc519]
5: (()+0x13c178) [0x7fbb31096178]
6: (()+0x80a4) [0x7fbb30b2a0a4]
7: (clone()+0x6d) [0x7fbb2f4a504d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted
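
To compare the six crashes without reading every line, a small Python
sketch like the one below can pull out each failing read, translate the
error code, and pair it with the "expect" line logged for the same op
number. The regular expressions are written against the log lines shown
above and are my own assumption about the format, not something taken
from RadosModel.h. The sample text is copied from the nodew output.

```python
import errno
import re

# Patterns modelled on the ceph_test_rados output pasted above (assumed format).
ERROR_RE = re.compile(r"^(\d+): Error: oid (\d+) read returned error code (-?\d+)")
EXPECT_RE = re.compile(r"^(\d+):\s+expect \(ObjNum (\d+) snap (\d+) seq_num (\d+)\)")


def summarize(log_text):
    """Return one dict per failed read, with the ObjNum the test expected."""
    expected = {}   # op number -> expected ObjNum
    failures = []
    for raw in log_text.splitlines():
        line = raw.strip()
        m = EXPECT_RE.match(line)
        if m:
            expected[int(m.group(1))] = int(m.group(2))
            continue
        m = ERROR_RE.match(line)
        if m:
            op, oid, code = (int(g) for g in m.groups())
            failures.append({
                "op": op,
                "oid": oid,
                "code": code,
                # Negative return codes are negated errno values.
                "errno": errno.errorcode.get(-code, "unknown"),
                "expected_objnum": expected.get(op),
            })
    return failures


SAMPLE = """\
1133: read oid 469 snap -1
1133:  expect (ObjNum 468 snap 0 seq_num 468)
1135: read oid 569 snap -1
1135:  expect (ObjNum 568 snap 0 seq_num 568)
1133: Error: oid 469 read returned error code -2
"""

if __name__ == "__main__":
    for failure in summarize(SAMPLE):
        print(failure)
```

Error code -2 is -ENOENT, so in each run the read came back "no such
object" for an object the test model expected to exist.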
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Mar 17, 2016 at 10:39 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Thu, 17 Mar 2016, Robert LeBlanc wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>>
>> I'm having trouble finding documentation about using ceph_test_rados. Can I
>> run this on the existing cluster and will that provide useful info? It seems
>> running it in the build will not have the caching set up (vstart.sh).
>>
>> I have accepted a job with another company and only have until Wednesday to
>> help with getting information about this bug. My new job will not be using
>> Ceph, so I won't be able to provide any additional info after Tuesday. I want
>> to leave the company on a good trajectory for upgrading, so any input you
>> can provide will be helpful.
>
> I'm sorry to hear it!  You'll be missed.  :)
>
>> I've found:
>>
>> ./ceph_test_rados --op read 100 --op write 100 --op delete 50
>> --max-ops 400000 --objects 1024 --max-in-flight 64 --size 4000000
>> --min-stride-size 400000 --max-stride-size 800000 --max-seconds 600
>> --op copy_from 50 --op snap_create 50 --op snap_remove 50 --op
>> rollback 50 --op setattr 25 --op rmattr 25 --pool unique_pool_0
>>
>> Is that enough if I change --pool to the cached pool and do the toggling
>> while ceph_test_rados is running? I think this will run for 10 minutes.
>
> Precisely.  You can probably drop copy_from and snap ops from the list
> since your workload wasn't exercising those.
>
> Thanks!
> sage
>
>
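
As a rough illustration of the suggestion quoted above, this Python
sketch assembles the ceph_test_rados command line with copy_from and
the snapshot ops left out. The flags and weights are the ones from the
quoted command. The pool name "cache_pool" is a placeholder, and
counting rollback among the snap ops is my assumption.

```python
# Op weights from the quoted command.
ORIGINAL_OPS = {
    "read": 100, "write": 100, "delete": 50,
    "copy_from": 50, "snap_create": 50, "snap_remove": 50,
    "rollback": 50, "setattr": 25, "rmattr": 25,
}

# Ops to drop. Treating rollback as a snap op is an assumption.
DROPPED = {"copy_from", "snap_create", "snap_remove", "rollback"}


def build_argv(pool, ops, binary="./ceph_test_rados"):
    """Return the argv list for a ceph_test_rados run against `pool`."""
    argv = [binary]
    for name, weight in ops.items():
        argv += ["--op", name, str(weight)]
    # Limits copied unchanged from the quoted command.
    argv += [
        "--max-ops", "400000", "--objects", "1024",
        "--max-in-flight", "64", "--size", "4000000",
        "--min-stride-size", "400000", "--max-stride-size", "800000",
        "--max-seconds", "600", "--pool", pool,
    ]
    return argv


if __name__ == "__main__":
    trimmed = {k: v for k, v in ORIGINAL_OPS.items() if k not in DROPPED}
    # "cache_pool" is a placeholder for the cached pool's real name.
    print(" ".join(build_argv("cache_pool", trimmed)))
```

With --max-seconds 600 the run stops after ten minutes, which leaves
time to toggle the cache settings while it is running.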