On Wed, Jan 18, 2012 at 12:48 PM, Andrey Stepachev <octo47@xxxxxxxxx> wrote:
> But I still don't know what happens with ceph such that it can't
> respond and hangs. That is not good behavior, because
> it leads to an unresponsive cluster in case of a
> temporary network failure.

I'm a little concerned about this. I would expect to see hangs of up to
~30 seconds (the timeout period), but for operations to then continue.
Are you taking the MDS down? If so, do you have any standbys specified?

> 2012/1/18 Andrey Stepachev <octo47@xxxxxxxxx>:
>> Hi,
>>
>> I've tested ceph against a laggy network (0ms-400ms delays).
>> At some point I got many messages like:
>> 2012-01-18 16:06:49.184776 7ff134119700 -- 84.201.161.73:6801/25424
>> send_message dropped message osd_op_reply(291 1000000101b.0000001e
>> [write 66734080~37
>> 4784] ondisk = 0) v1 because of no pipe on con 0x315e640
>> And ceph doesn't respond to ls on some of the subdirs (via hadoop fs -ls
>> or the kernel client).
>> My cluster was running with no debug at that moment, so I can't find out
>> what was going on.
>>
>> After a restart ceph writes to the log:
>> 2012-01-18 16:10:39.985509 7f217989d780 osd.1 155 pg[0.155( v 136'373
>> (94'368,136'373]+backlog n=3 ec=1 les/c 150/145 146/151/58) [] r=0
>> lpr=0 (info mismatch, log(94'368,0'0]+backlog) (log bound mismatch,
>> actual=[8'124,94'369]) lcod 0'0 mlcod 0'0 inactive] read_log got dup
>> 94'369 (last was 94'369, dropping that one)
>>
>> After these strange hangs I found that, after rm -rf on the filesystem
>> (mounted via the kernel client),
>> the fs shows that 210Gb is still in use. Looking at /data/osd.x I found
>> many objects inside.
>> So:
>> a) it looks like some errors lead to orphaned objects in rados
>> b) I can't find a utility that can check for that orphaned data (and
>> clean it up)
>>
>> Question: how can I identify what these objects are, and how can I
>> clean them up?
>>
>> --
>> Andrey.
>
>
>
> --
> Andrey.
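
On the question of identifying what the leftover objects are, here is a
rough sketch rather than a real tool. It assumes the file data objects
are named <inode in hex>.<object index as 8 hex digits>, which is the
shape of the name in the dropped osd_op_reply above
(1000000101b.0000001e). The function name group_by_inode and the sample
names other than that one are made up for illustration. It only groups
object names by inode; it does not talk to the cluster.

```python
import re
from collections import defaultdict

# Assumption: data objects are named "<inode hex>.<object index, 8 hex digits>",
# as in the logged name 1000000101b.0000001e.
OBJECT_NAME = re.compile(r"^([0-9a-f]+)\.([0-9a-f]{8})$")


def group_by_inode(names):
    """Group object names by inode number.

    Returns (groups, unmatched): groups maps inode number to a sorted list
    of object indexes, unmatched lists names that do not fit the pattern.
    """
    groups = defaultdict(list)
    unmatched = []
    for raw in names:
        name = raw.strip()
        if not name:
            continue  # skip blank lines in a listing
        match = OBJECT_NAME.match(name)
        if match is None:
            unmatched.append(name)
            continue
        inode = int(match.group(1), 16)
        index = int(match.group(2), 16)
        groups[inode].append(index)
    return {ino: sorted(idx) for ino, idx in groups.items()}, unmatched


# Hypothetical listing; only the first name comes from the log above.
sample = [
    "1000000101b.0000001e",
    "1000000101b.00000000",
    "10000000200.00000003",
    "not-a-file-object",
]
groups, unmatched = group_by_inode(sample)
for ino, indexes in sorted(groups.items()):
    print(f"inode {ino} (0x{ino:x}): {len(indexes)} object(s)")
print("unmatched:", unmatched)
```

The input could be the output of "rados -p data ls" (the pool name
"data" is an assumption about your setup). If the naming assumption
holds, inode numbers that no longer show up anywhere in the mounted tree
(for example in the output of GNU find with -printf '%i\n') would be the
candidates to look at more closely before removing anything.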
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html