One of the failures we see in QA during OSD thrashing comes from commands like 'ceph osd out 2'. A transient (say, network) error can prevent the ceph command from getting the reply, and when the command is resent we get EINVAL 'already marked down' or similar.

In general, should we make these commands return success in those cases? Or should callers be prepared to tolerate those sorts of errors?

The in/out/down ones seem pretty straightforward to me. A trickier one is 'ceph osd create 123', which currently fails if 123 already exists (and you thus do not get to use that id). Not that we do that; the Chef stuff will do 'ceph osd create' and a newly allocated id will be part of the reply.

Thoughts?

sage
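
To make the two options for the in/out/down case concrete, here is a minimal Python sketch. It does not use any real Ceph API: FakeOsdMap, mark_out_strict, mark_out_idempotent, and send_with_lost_reply are hypothetical names, and the reply strings are made up for illustration. It only models a command that takes effect, loses its reply, and is then resent.

```python
import errno


class FakeOsdMap:
    """Hypothetical in-memory stand-in for the monitor's view of OSD state."""

    def __init__(self, osd_ids):
        self.is_out = {osd_id: False for osd_id in osd_ids}

    def mark_out_strict(self, osd_id):
        # Option 1: repeating the command is reported as an error.
        if osd_id not in self.is_out:
            return -errno.ENOENT, "osd.%d does not exist" % osd_id
        if self.is_out[osd_id]:
            return -errno.EINVAL, "osd.%d is already out" % osd_id
        self.is_out[osd_id] = True
        return 0, "marked out osd.%d" % osd_id

    def mark_out_idempotent(self, osd_id):
        # Option 2: repeating the command succeeds, because the requested
        # state already holds.
        if osd_id not in self.is_out:
            return -errno.ENOENT, "osd.%d does not exist" % osd_id
        already_out = self.is_out[osd_id]
        self.is_out[osd_id] = True
        if already_out:
            return 0, "osd.%d is already out" % osd_id
        return 0, "marked out osd.%d" % osd_id


def send_with_lost_reply(handler, osd_id):
    """Apply the command, drop the first reply, resend, return the second reply."""
    handler(osd_id)         # first attempt takes effect, but its reply is lost
    return handler(osd_id)  # the client resends and sees only this reply


strict_map = FakeOsdMap([2])
idempotent_map = FakeOsdMap([2])
strict_rc, strict_msg = send_with_lost_reply(strict_map.mark_out_strict, 2)
idem_rc, idem_msg = send_with_lost_reply(idempotent_map.mark_out_idempotent, 2)
print(strict_rc, strict_msg)
print(idem_rc, idem_msg)
```

In both cases the OSD ends up marked out. With the strict handler the caller has to treat EINVAL on a resend as success; with the idempotent handler no special handling is needed. The sketch does not cover 'ceph osd create 123', where a repeat is harder to tell apart from a real conflict.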