Also, is this ceph_test_rados run rewriting objects quickly? I think the
issue is with rewriting objects, so if we can tailor ceph_test_rados to
do that, it might be easier to reproduce.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Thu, Mar 17, 2016 at 11:05 AM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> I'll miss the Ceph community as well. There were a few things I really
> wanted to work in with Ceph.
>
> I got this:
>
> update_object_version oid 13 v 1166 (ObjNum 1028 snap 0 seq_num 1028)
> dirty exists
> 1038: left oid 13 (ObjNum 1028 snap 0 seq_num 1028)
> 1040: finishing write tid 1 to nodez23350-256
> 1040: finishing write tid 2 to nodez23350-256
> 1040: finishing write tid 3 to nodez23350-256
> 1040: finishing write tid 4 to nodez23350-256
> 1040: finishing write tid 6 to nodez23350-256
> 1035: done (4 left)
> 1037: done (3 left)
> 1038: done (2 left)
> 1043: read oid 430 snap -1
> 1043: expect (ObjNum 429 snap 0 seq_num 429)
> 1040: finishing write tid 7 to nodez23350-256
> update_object_version oid 256 v 661 (ObjNum 1029 snap 0 seq_num 1029)
> dirty exists
> 1040: left oid 256 (ObjNum 1029 snap 0 seq_num 1029)
> 1042: expect (ObjNum 664 snap 0 seq_num 664)
> 1043: Error: oid 430 read returned error code -2
> ./test/osd/RadosModel.h: In function 'virtual void
> ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fa1bf7fe700 time
> 2016-03-17 10:47:19.085414
> ./test/osd/RadosModel.h: 1109: FAILED assert(0)
> ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x76) [0x4db956]
> 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
> 3: (()+0x9791d) [0x7fa1d472191d]
> 4: (()+0x72519) [0x7fa1d46fc519]
> 5: (()+0x13c178) [0x7fa1d47c6178]
> 6: (()+0x80a4) [0x7fa1d425a0a4]
> 7: (clone()+0x6d) [0x7fa1d2bd504d]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> terminate called after throwing an instance of 'ceph::FailedAssertion'
> Aborted
>
> I had to toggle writeback/forward and min_read_recency_for_promote a
> few times to get it, but I don't know if that is because I only had one
> job running. Even with six jobs running, it is not easy to trigger
> with ceph_test_rados, but it is nearly instant in the RBD VMs.
>
> Here are the crashes from the six runs (I have about the last 2000
> lines of each if needed):
>
> nodev:
> update_object_version oid 1015 v 1255 (ObjNum 1014 snap 0 seq_num
> 1014) dirty exists
> 1015: left oid 1015 (ObjNum 1014 snap 0 seq_num 1014)
> 1016: finishing write tid 1 to nodev21799-1016
> 1016: finishing write tid 2 to nodev21799-1016
> 1016: finishing write tid 3 to nodev21799-1016
> 1016: finishing write tid 4 to nodev21799-1016
> 1016: finishing write tid 6 to nodev21799-1016
> 1016: finishing write tid 7 to nodev21799-1016
> update_object_version oid 1016 v 1957 (ObjNum 1015 snap 0 seq_num
> 1015) dirty exists
> 1016: left oid 1016 (ObjNum 1015 snap 0 seq_num 1015)
> 1017: finishing write tid 1 to nodev21799-1017
> 1017: finishing write tid 2 to nodev21799-1017
> 1017: finishing write tid 3 to nodev21799-1017
> 1017: finishing write tid 5 to nodev21799-1017
> 1017: finishing write tid 6 to nodev21799-1017
> update_object_version oid 1017 v 1010 (ObjNum 1016 snap 0 seq_num
> 1016) dirty exists
> 1017: left oid 1017 (ObjNum 1016 snap 0 seq_num 1016)
> 1018: finishing write tid 1 to nodev21799-1018
> 1018: finishing write tid 2 to nodev21799-1018
> 1018: finishing write tid 3 to nodev21799-1018
> 1018: finishing write tid 4 to nodev21799-1018
> 1018: finishing write tid 6 to nodev21799-1018
> 1018: finishing write tid 7 to nodev21799-1018
> update_object_version oid 1018 v 1093 (ObjNum 1017 snap 0 seq_num
> 1017) dirty exists
> 1018: left oid 1018 (ObjNum 1017 snap 0 seq_num 1017)
> 1019: finishing write tid 1 to nodev21799-1019
> 1019: finishing write tid 2 to nodev21799-1019
> 1019: finishing write tid
3 to nodev21799-1019 > 1019: finishing write tid 5 to nodev21799-1019 > 1019: finishing write tid 6 to nodev21799-1019 > update_object_version oid 1019 v 462 (ObjNum 1018 snap 0 seq_num 1018) > dirty exists > 1019: left oid 1019 (ObjNum 1018 snap 0 seq_num 1018) > 1021: finishing write tid 1 to nodev21799-1021 > 1020: finishing write tid 1 to nodev21799-1020 > 1020: finishing write tid 2 to nodev21799-1020 > 1020: finishing write tid 3 to nodev21799-1020 > 1020: finishing write tid 5 to nodev21799-1020 > 1020: finishing write tid 6 to nodev21799-1020 > update_object_version oid 1020 v 1287 (ObjNum 1019 snap 0 seq_num > 1019) dirty exists > 1020: left oid 1020 (ObjNum 1019 snap 0 seq_num 1019) > 1021: finishing write tid 2 to nodev21799-1021 > 1021: finishing write tid 3 to nodev21799-1021 > 1021: finishing write tid 5 to nodev21799-1021 > 1021: finishing write tid 6 to nodev21799-1021 > update_object_version oid 1021 v 1077 (ObjNum 1020 snap 0 seq_num > 1020) dirty exists > 1021: left oid 1021 (ObjNum 1020 snap 0 seq_num 1020) > 1022: finishing write tid 1 to nodev21799-1022 > 1022: finishing write tid 2 to nodev21799-1022 > 1022: finishing write tid 3 to nodev21799-1022 > 1022: finishing write tid 5 to nodev21799-1022 > 1022: finishing write tid 6 to nodev21799-1022 > update_object_version oid 1022 v 1213 (ObjNum 1021 snap 0 seq_num > 1021) dirty exists > 1022: left oid 1022 (ObjNum 1021 snap 0 seq_num 1021) > 1023: finishing write tid 1 to nodev21799-1023 > 1023: finishing write tid 2 to nodev21799-1023 > 1023: finishing write tid 3 to nodev21799-1023 > 1023: finishing write tid 5 to nodev21799-1023 > 1023: finishing write tid 6 to nodev21799-1023 > update_object_version oid 1023 v 2612 (ObjNum 1022 snap 0 seq_num > 1022) dirty exists > 1023: left oid 1023 (ObjNum 1022 snap 0 seq_num 1022) > 1024: finishing write tid 1 to nodev21799-1024 > 1025: Error: oid 219 read returned error code -2 > ./test/osd/RadosModel.h: In function 'virtual void > 
ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f0df8a16700 time > 2016-03-17 10:53:43.493575 > ./test/osd/RadosModel.h: 1109: FAILED assert(0) > ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x76) [0x4db956] > 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c] > 3: (()+0x9791d) [0x7f0e015dd91d] > 4: (()+0x72519) [0x7f0e015b8519] > 5: (()+0x13c178) [0x7f0e01682178] > 6: (()+0x80a4) [0x7f0e011160a4] > 7: (clone()+0x6d) [0x7f0dffa9104d] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is > needed to interpret this. > terminate called after throwing an instance of 'ceph::FailedAssertion' > Aborted > > nodew: > 1117: expect (ObjNum 8 snap 0 seq_num 8) > 1120: expect (ObjNum 95 snap 0 seq_num 95) > 1121: expect (ObjNum 994 snap 0 seq_num 994) > 1113: expect (ObjNum 362 snap 0 seq_num 362) > 1118: expect (ObjNum 179 snap 0 seq_num 179) > 1115: expect (ObjNum 943 snap 0 seq_num 943) > 1119: expect (ObjNum 250 snap 0 seq_num 250) > 1124: finishing write tid 1 to nodew21820-361 > 1124: finishing write tid 2 to nodew21820-361 > 1124: finishing write tid 3 to nodew21820-361 > 1124: finishing write tid 4 to nodew21820-361 > 1124: finishing write tid 6 to nodew21820-361 > 1124: finishing write tid 7 to nodew21820-361 > update_object_version oid 361 v 892 (ObjNum 1061 snap 0 seq_num 1061) > dirty exists > 1124: left oid 361 (ObjNum 1061 snap 0 seq_num 1061) > 1125: finishing write tid 1 to nodew21820-486 > 1125: finishing write tid 2 to nodew21820-486 > 1125: finishing write tid 3 to nodew21820-486 > 1125: finishing write tid 5 to nodew21820-486 > 1125: finishing write tid 6 to nodew21820-486 > update_object_version oid 486 v 1317 (ObjNum 1062 snap 0 seq_num 1062) > dirty exists > 1125: left oid 486 (ObjNum 1062 snap 0 seq_num 1062) > 1126: expect (ObjNum 289 snap 0 seq_num 289) > 1127: finishing write tid 1 to nodew21820-765 > 1127: finishing write tid 2 to 
nodew21820-765 > 1127: finishing write tid 3 to nodew21820-765 > 1127: finishing write tid 5 to nodew21820-765 > 1127: finishing write tid 6 to nodew21820-765 > update_object_version oid 765 v 1156 (ObjNum 1063 snap 0 seq_num 1063) > dirty exists > 1127: left oid 765 (ObjNum 1063 snap 0 seq_num 1063) > 1128: finishing write tid 1 to nodew21820-40 > 1128: finishing write tid 2 to nodew21820-40 > 1128: finishing write tid 3 to nodew21820-40 > 1128: finishing write tid 5 to nodew21820-40 > 1128: finishing write tid 6 to nodew21820-40 > update_object_version oid 40 v 876 (ObjNum 1064 snap 0 seq_num 1064) > dirty exists > 1128: left oid 40 (ObjNum 1064 snap 0 seq_num 1064) > 1129: expect (ObjNum 616 snap 0 seq_num 616) > 1110: done (14 left) > 1113: done (13 left) > 1115: done (12 left) > 1117: done (11 left) > 1118: done (10 left) > 1119: done (9 left) > 1120: done (8 left) > 1121: done (7 left) > 1124: done (6 left) > 1125: done (5 left) > 1126: done (4 left) > 1127: done (3 left) > 1128: done (2 left) > 1129: done (1 left) > 1131: read oid 29 snap -1 > 1131: expect (ObjNum 28 snap 0 seq_num 28) > 1132: read oid 764 snap -1 > 1132: expect (ObjNum 763 snap 0 seq_num 763) > 1133: read oid 469 snap -1 > 1133: expect (ObjNum 468 snap 0 seq_num 468) > 1134: write oid 243 current snap is 0 > 1134: seq_num 1065 ranges > {483354=596553,1514502=531232,2509844=632287,3283353=1} > 1134: writing nodew21820-243 from 483354 to 1079907 tid 1 > 1134: writing nodew21820-243 from 1514502 to 2045734 tid 2 > 1134: writing nodew21820-243 from 2509844 to 3142131 tid 3 > 1134: writing nodew21820-243 from 3283353 to 3283354 tid 4 > 1135: read oid 569 snap -1 > 1135: expect (ObjNum 568 snap 0 seq_num 568) > 1133: Error: oid 469 read returned error code -2 > ./test/osd/RadosModel.h: In function 'virtual void > ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fae71d03700 time > 2016-03-17 11:00:02.124951 > ./test/osd/RadosModel.h: 1109: FAILED assert(0) > ceph version 0.94.6 
(e832001feaf8c176593e0325c8298e3f16dfb403) > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x76) [0x4db956] > 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c] > 3: (()+0x9791d) [0x7fae7a8ca91d] > 4: (()+0x72519) [0x7fae7a8a5519] > 5: (()+0x13c178) [0x7fae7a96f178] > 6: (()+0x80a4) [0x7fae7a4030a4] > 7: (clone()+0x6d) [0x7fae78d7e04d] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is > needed to interpret this. > terminate called after throwing an instance of 'ceph::FailedAssertion' > Aborted > > nodex: > > 1024: finishing write tid 1 to nodex22014-1024 > 1025: expect (ObjNum 75 snap 0 seq_num 75) > 1024: finishing write tid 2 to nodex22014-1024 > 1024: finishing write tid 3 to nodex22014-1024 > 1024: finishing write tid 5 to nodex22014-1024 > 1024: finishing write tid 6 to nodex22014-1024 > update_object_version oid 1024 v 753 (ObjNum 1023 snap 0 seq_num 1023) > dirty exists > 1024: left oid 1024 (ObjNum 1023 snap 0 seq_num 1023) > 982: done (44 left) > 983: done (43 left) > 984: done (42 left) > 985: done (41 left) > 986: done (40 left) > 987: done (39 left) > 988: done (38 left) > 989: done (37 left) > 990: done (36 left) > 991: done (35 left) > 992: done (34 left) > 993: done (33 left) > 994: done (32 left) > 995: done (31 left) > 996: done (30 left) > 997: done (29 left) > 998: done (28 left) > 999: done (27 left) > 1000: done (26 left) > 1001: done (25 left) > 1002: done (24 left) > 1003: done (23 left) > 1004: done (22 left) > 1005: done (21 left) > 1006: done (20 left) > 1007: done (19 left) > 1008: done (18 left) > 1009: done (17 left) > 1010: done (16 left) > 1011: done (15 left) > 1012: done (14 left) > 1013: done (13 left) > 1014: done (12 left) > 1015: done (11 left) > 1016: done (10 left) > 1017: done (9 left) > 1018: done (8 left) > 1019: done (7 left) > 1020: done (6 left) > 1021: done (5 left) > 1022: done (4 left) > 1023: done (3 left) > 1024: done (2 left) > 1025: done (1 left) > 1026: done 
(0 left) > 1027: delete oid 101 current snap is 0 > 1027: done (0 left) > 1028: read oid 156 snap -1 > 1028: expect (ObjNum 155 snap 0 seq_num 155) > 1029: read oid 691 snap -1 > 1029: expect (ObjNum 690 snap 0 seq_num 690) > 1030: read oid 282 snap -1 > 1030: expect (ObjNum 281 snap 0 seq_num 281) > 1031: read oid 979 snap -1 > 1031: expect (ObjNum 978 snap 0 seq_num 978) > 1032: read oid 203 snap -1 > 1032: expect (ObjNum 202 snap 0 seq_num 202) > 1033: setattr oid 464 current snap is 0 > 1032: Error: oid 203 read returned error code -2 > ./test/osd/RadosModel.h: In function 'virtual void > ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fafee64a700 time > 2016-03-17 10:53:44.291343 > ./test/osd/RadosModel.h: 1109: FAILED assert(0) > ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x76) [0x4db956] > 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c] > 3: (()+0x9791d) [0x7faff721191d] > 4: (()+0x72519) [0x7faff71ec519] > 5: (()+0x13c178) [0x7faff72b6178] > 6: (()+0x80a4) [0x7faff6d4a0a4] > 7: (clone()+0x6d) [0x7faff56c504d] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is > needed to interpret this. 
> terminate called after throwing an instance of 'ceph::FailedAssertion' > Aborted > > nodey: > > 974: done (52 left) > 975: done (51 left) > 976: done (50 left) > 977: done (49 left) > 978: done (48 left) > 979: done (47 left) > 980: done (46 left) > 981: done (45 left) > 982: done (44 left) > 983: done (43 left) > 984: done (42 left) > 985: done (41 left) > 986: done (40 left) > 987: done (39 left) > 988: done (38 left) > 989: done (37 left) > 990: done (36 left) > 991: done (35 left) > 992: done (34 left) > 993: done (33 left) > 994: done (32 left) > 995: done (31 left) > 996: done (30 left) > 997: done (29 left) > 998: done (28 left) > 999: done (27 left) > 1000: done (26 left) > 1001: done (25 left) > 1002: done (24 left) > 1003: done (23 left) > 1004: done (22 left) > 1005: done (21 left) > 1006: done (20 left) > 1007: done (19 left) > 1008: done (18 left) > 1009: done (17 left) > 1010: done (16 left) > 1011: done (15 left) > 1012: done (14 left) > 1013: done (13 left) > 1014: done (12 left) > 1015: done (11 left) > 1016: done (10 left) > 1017: done (9 left) > 1018: done (8 left) > 1019: done (7 left) > 1020: done (6 left) > 1021: done (5 left) > 1022: done (4 left) > 1023: done (3 left) > 1024: done (2 left) > 1025: done (1 left) > 1026: done (0 left) > 1027: delete oid 101 current snap is 0 > 1027: done (0 left) > 1028: read oid 156 snap -1 > 1028: expect (ObjNum 155 snap 0 seq_num 155) > 1029: read oid 691 snap -1 > 1029: expect (ObjNum 690 snap 0 seq_num 690) > 1030: read oid 282 snap -1 > 1030: expect (ObjNum 281 snap 0 seq_num 281) > 1031: read oid 979 snap -1 > 1031: expect (ObjNum 978 snap 0 seq_num 978) > 1032: read oid 203 snap -1 > 1032: expect (ObjNum 202 snap 0 seq_num 202) > 1033: setattr oid 464 current snap is 0 > 1028: Error: oid 156 read returned error code -2 > ./test/osd/RadosModel.h: In function 'virtual void > ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f55fa3c6700 time > 2016-03-17 10:53:57.082571 > ./test/osd/RadosModel.h: 1109: 
FAILED assert(0) > ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x76) [0x4db956] > 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c] > 3: (()+0x9791d) [0x7f5602f8d91d] > 4: (()+0x72519) [0x7f5602f68519] > 5: (()+0x13c178) [0x7f5603032178] > 6: (()+0x80a4) [0x7f5602ac60a4] > 7: (clone()+0x6d) [0x7f560144104d] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is > needed to interpret this. > terminate called after throwing an instance of 'ceph::FailedAssertion' > Aborted > > nodez: > > 1014: done (11 left) > 1026: delete oid 717 current snap is 0 > 1015: finishing write tid 2 to nodez24249-1015 > 1015: finishing write tid 4 to nodez24249-1015 > 1015: finishing write tid 5 to nodez24249-1015 > update_object_version oid 1015 v 3003 (ObjNum 1014 snap 0 seq_num > 1014) dirty exists > 1015: left oid 1015 (ObjNum 1014 snap 0 seq_num 1014) > 1016: finishing write tid 1 to nodez24249-1016 > 1016: finishing write tid 2 to nodez24249-1016 > 1016: finishing write tid 3 to nodez24249-1016 > 1016: finishing write tid 4 to nodez24249-1016 > 1016: finishing write tid 6 to nodez24249-1016 > 1016: finishing write tid 7 to nodez24249-1016 > update_object_version oid 1016 v 1201 (ObjNum 1015 snap 0 seq_num > 1015) dirty exists > 1016: left oid 1016 (ObjNum 1015 snap 0 seq_num 1015) > 1017: finishing write tid 1 to nodez24249-1017 > 1017: finishing write tid 2 to nodez24249-1017 > 1017: finishing write tid 3 to nodez24249-1017 > 1017: finishing write tid 5 to nodez24249-1017 > 1017: finishing write tid 6 to nodez24249-1017 > update_object_version oid 1017 v 3007 (ObjNum 1016 snap 0 seq_num > 1016) dirty exists > 1017: left oid 1017 (ObjNum 1016 snap 0 seq_num 1016) > 1018: finishing write tid 1 to nodez24249-1018 > 1018: finishing write tid 2 to nodez24249-1018 > 1018: finishing write tid 3 to nodez24249-1018 > 1018: finishing write tid 4 to nodez24249-1018 > 1018: 
finishing write tid 6 to nodez24249-1018 > 1018: finishing write tid 7 to nodez24249-1018 > update_object_version oid 1018 v 1283 (ObjNum 1017 snap 0 seq_num > 1017) dirty exists > 1018: left oid 1018 (ObjNum 1017 snap 0 seq_num 1017) > 1019: finishing write tid 1 to nodez24249-1019 > 1019: finishing write tid 2 to nodez24249-1019 > 1019: finishing write tid 3 to nodez24249-1019 > 1019: finishing write tid 5 to nodez24249-1019 > 1019: finishing write tid 6 to nodez24249-1019 > update_object_version oid 1019 v 999 (ObjNum 1018 snap 0 seq_num 1018) > dirty exists > 1019: left oid 1019 (ObjNum 1018 snap 0 seq_num 1018) > 1020: finishing write tid 1 to nodez24249-1020 > 1020: finishing write tid 2 to nodez24249-1020 > 1020: finishing write tid 3 to nodez24249-1020 > 1020: finishing write tid 5 to nodez24249-1020 > 1020: finishing write tid 6 to nodez24249-1020 > update_object_version oid 1020 v 813 (ObjNum 1019 snap 0 seq_num 1019) > dirty exists > 1020: left oid 1020 (ObjNum 1019 snap 0 seq_num 1019) > 1021: finishing write tid 1 to nodez24249-1021 > 1021: finishing write tid 2 to nodez24249-1021 > 1021: finishing write tid 3 to nodez24249-1021 > 1021: finishing write tid 5 to nodez24249-1021 > 1021: finishing write tid 6 to nodez24249-1021 > update_object_version oid 1021 v 1038 (ObjNum 1020 snap 0 seq_num > 1020) dirty exists > 1021: left oid 1021 (ObjNum 1020 snap 0 seq_num 1020) > 1022: finishing write tid 1 to nodez24249-1022 > 1022: finishing write tid 2 to nodez24249-1022 > 1022: finishing write tid 3 to nodez24249-1022 > 1022: finishing write tid 5 to nodez24249-1022 > 1022: finishing write tid 6 to nodez24249-1022 > update_object_version oid 1022 v 781 (ObjNum 1021 snap 0 seq_num 1021) > dirty exists > 1022: left oid 1022 (ObjNum 1021 snap 0 seq_num 1021) > 1023: finishing write tid 1 to nodez24249-1023 > 1023: finishing write tid 2 to nodez24249-1023 > 1023: finishing write tid 3 to nodez24249-1023 > 1023: finishing write tid 5 to nodez24249-1023 > 1023: 
finishing write tid 6 to nodez24249-1023 > update_object_version oid 1023 v 1537 (ObjNum 1022 snap 0 seq_num > 1022) dirty exists > 1023: left oid 1023 (ObjNum 1022 snap 0 seq_num 1022) > 1024: finishing write tid 1 to nodez24249-1024 > 1025: Error: oid 230 read returned error code -2 > ./test/osd/RadosModel.h: In function 'virtual void > ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fd9bb7fe700 time > 2016-03-17 10:53:41.757921 > ./test/osd/RadosModel.h: 1109: FAILED assert(0) > ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x76) [0x4db956] > 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c] > 3: (()+0x9791d) [0x7fd9d088d91d] > 4: (()+0x72519) [0x7fd9d0868519] > 5: (()+0x13c178) [0x7fd9d0932178] > 6: (()+0x80a4) [0x7fd9d03c60a4] > 7: (clone()+0x6d) [0x7fd9ced4104d] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is > needed to interpret this. > terminate called after throwing an instance of 'ceph::FailedAssertion' > Aborted > > nodezz: > > 1015: finishing write tid 1 to nodezz25161-1015 > 1015: finishing write tid 2 to nodezz25161-1015 > 1015: finishing write tid 4 to nodezz25161-1015 > 1015: finishing write tid 5 to nodezz25161-1015 > update_object_version oid 1015 v 900 (ObjNum 1014 snap 0 seq_num 1014) > dirty exists > 1015: left oid 1015 (ObjNum 1014 snap 0 seq_num 1014) > 1016: finishing write tid 1 to nodezz25161-1016 > 1016: finishing write tid 2 to nodezz25161-1016 > 1016: finishing write tid 3 to nodezz25161-1016 > 1016: finishing write tid 4 to nodezz25161-1016 > 1016: finishing write tid 6 to nodezz25161-1016 > 1016: finishing write tid 7 to nodezz25161-1016 > update_object_version oid 1016 v 1021 (ObjNum 1015 snap 0 seq_num > 1015) dirty exists > 1016: left oid 1016 (ObjNum 1015 snap 0 seq_num 1015) > 1017: finishing write tid 1 to nodezz25161-1017 > 1017: finishing write tid 2 to nodezz25161-1017 > 1017: finishing write tid 3 to 
nodezz25161-1017 > 1017: finishing write tid 5 to nodezz25161-1017 > 1017: finishing write tid 6 to nodezz25161-1017 > update_object_version oid 1017 v 3011 (ObjNum 1016 snap 0 seq_num > 1016) dirty exists > 1017: left oid 1017 (ObjNum 1016 snap 0 seq_num 1016) > 1018: finishing write tid 1 to nodezz25161-1018 > 1018: finishing write tid 2 to nodezz25161-1018 > 1018: finishing write tid 3 to nodezz25161-1018 > 1018: finishing write tid 4 to nodezz25161-1018 > 1018: finishing write tid 6 to nodezz25161-1018 > 1018: finishing write tid 7 to nodezz25161-1018 > update_object_version oid 1018 v 1099 (ObjNum 1017 snap 0 seq_num > 1017) dirty exists > 1018: left oid 1018 (ObjNum 1017 snap 0 seq_num 1017) > 1019: finishing write tid 1 to nodezz25161-1019 > 1019: finishing write tid 2 to nodezz25161-1019 > 1019: finishing write tid 3 to nodezz25161-1019 > 1019: finishing write tid 5 to nodezz25161-1019 > 1019: finishing write tid 6 to nodezz25161-1019 > update_object_version oid 1019 v 1300 (ObjNum 1018 snap 0 seq_num > 1018) dirty exists > 1019: left oid 1019 (ObjNum 1018 snap 0 seq_num 1018) > 1020: finishing write tid 1 to nodezz25161-1020 > 1020: finishing write tid 2 to nodezz25161-1020 > 1020: finishing write tid 3 to nodezz25161-1020 > 1020: finishing write tid 5 to nodezz25161-1020 > 1020: finishing write tid 6 to nodezz25161-1020 > update_object_version oid 1020 v 1324 (ObjNum 1019 snap 0 seq_num > 1019) dirty exists > 1020: left oid 1020 (ObjNum 1019 snap 0 seq_num 1019) > 1021: finishing write tid 1 to nodezz25161-1021 > 1021: finishing write tid 2 to nodezz25161-1021 > 1021: finishing write tid 3 to nodezz25161-1021 > 1021: finishing write tid 5 to nodezz25161-1021 > 1021: finishing write tid 6 to nodezz25161-1021 > update_object_version oid 1021 v 890 (ObjNum 1020 snap 0 seq_num 1020) > dirty exists > 1021: left oid 1021 (ObjNum 1020 snap 0 seq_num 1020) > 1022: finishing write tid 1 to nodezz25161-1022 > 1022: finishing write tid 2 to nodezz25161-1022 > 1022: 
finishing write tid 3 to nodezz25161-1022
> 1022: finishing write tid 5 to nodezz25161-1022
> 1022: finishing write tid 6 to nodezz25161-1022
> update_object_version oid 1022 v 464 (ObjNum 1021 snap 0 seq_num 1021)
> dirty exists
> 1022: left oid 1022 (ObjNum 1021 snap 0 seq_num 1021)
> 1023: finishing write tid 1 to nodezz25161-1023
> 1023: finishing write tid 2 to nodezz25161-1023
> 1023: finishing write tid 3 to nodezz25161-1023
> 1023: finishing write tid 5 to nodezz25161-1023
> 1023: finishing write tid 6 to nodezz25161-1023
> update_object_version oid 1023 v 1516 (ObjNum 1022 snap 0 seq_num
> 1022) dirty exists
> 1023: left oid 1023 (ObjNum 1022 snap 0 seq_num 1022)
> 1024: finishing write tid 1 to nodezz25161-1024
> 1024: finishing write tid 2 to nodezz25161-1024
> 1025: Error: oid 219 read returned error code -2
> ./test/osd/RadosModel.h: In function 'virtual void
> ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fbb1bfff700 time
> 2016-03-17 10:53:53.071338
> ./test/osd/RadosModel.h: 1109: FAILED assert(0)
> ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x76) [0x4db956]
> 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
> 3: (()+0x9791d) [0x7fbb30ff191d]
> 4: (()+0x72519) [0x7fbb30fcc519]
> 5: (()+0x13c178) [0x7fbb31096178]
> 6: (()+0x80a4) [0x7fbb30b2a0a4]
> 7: (clone()+0x6d) [0x7fbb2f4a504d]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> terminate called after throwing an instance of 'ceph::FailedAssertion'
> Aborted
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Mar 17, 2016 at 10:39 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> On Thu, 17 Mar 2016, Robert LeBlanc wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA256
>>>
>>> I'm having trouble finding documentation about using ceph_test_rados. Can I
>>> run this on the existing cluster and will that provide useful info? It seems
>>> running it in the build will not have the caching set up (vstart.sh).
>>>
>>> I have accepted a job with another company and only have until Wednesday to
>>> help with getting information about this bug. My new job will not be using
>>> Ceph, so I won't be able to provide any additional info after Tuesday. I want
>>> to leave the company on a good trajectory for upgrading, so any input you
>>> can provide will be helpful.
>>
>> I'm sorry to hear it! You'll be missed. :)
>>
>>> I've found:
>>>
>>> ./ceph_test_rados --op read 100 --op write 100 --op delete 50
>>> --max-ops 400000 --objects 1024 --max-in-flight 64 --size 4000000
>>> --min-stride-size 400000 --max-stride-size 800000 --max-seconds 600
>>> --op copy_from 50 --op snap_create 50 --op snap_remove 50 --op
>>> rollback 50 --op setattr 25 --op rmattr 25 --pool unique_pool_0
>>>
>>> Is that enough if I change --pool to the cached pool and do the toggling
>>> while ceph_test_rados is running? I think this will run for 10 minutes.
>>
>> Precisely. You can probably drop copy_from and snap ops from the list
>> since your workload wasn't exercising those.
>>
>> Thanks!
>> sage
>>
>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
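
For anyone triaging ceph_test_rados output like the crash logs quoted in this thread, here is a minimal Python sketch that pairs each failed read with the object the test model expected to find. It assumes only the two log line shapes that appear above ("N: expect (ObjNum ...)" and "N: Error: oid ... read returned error code ..."). The function name find_failed_reads and the sample list are illustrative and not part of Ceph. Error code -2 corresponds to ENOENT on Linux, i.e. the read came back "no such object" even though the model expected the object to exist.

```python
import errno
import re

# Line shapes taken from the logs in this thread; leading "> " quoting is
# tolerated because the patterns are searched for, not anchored.
EXPECT_RE = re.compile(r"(\d+): expect \(ObjNum (\d+) snap (-?\d+) seq_num (\d+)\)")
ERROR_RE = re.compile(r"(\d+): Error: oid (\d+) read returned error code (-?\d+)")


def find_failed_reads(lines):
    """Return one dict per failed read, with the ObjNum the model expected.

    Hypothetical helper for log triage; not part of ceph_test_rados.
    """
    expected = {}   # op id -> ObjNum from the matching "expect" line
    failures = []
    for line in lines:
        m = EXPECT_RE.search(line)
        if m:
            expected[int(m.group(1))] = int(m.group(2))
            continue
        m = ERROR_RE.search(line)
        if m:
            op, oid, code = (int(g) for g in m.groups())
            failures.append({
                "op": op,
                "oid": oid,
                "code": code,
                # Map the negated return code to its symbolic errno name.
                "errno": errno.errorcode.get(-code, "unknown"),
                "expected_objnum": expected.get(op),
            })
    return failures


# Lines copied from the first crash log quoted in this thread.
sample = [
    "> 1043: read oid 430 snap -1",
    "> 1043: expect (ObjNum 429 snap 0 seq_num 429)",
    "> 1042: expect (ObjNum 664 snap 0 seq_num 664)",
    "> 1043: Error: oid 430 read returned error code -2",
]

for failure in find_failed_reads(sample):
    print(failure)
```

Running the same function over the saved tail of each node's output would give a quick list of which oids came back missing, which may help when comparing the six runs.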