How did you notice/conclude these are problems?

I don't have specific details about how objects_read_async_no_cache()
is handled through the code, but the erasure code read paths can get
pretty obtuse and multi-layered; in some paths we know there can't be
errors, but in others there might be.

As for the return codes of writes, they definitely can be filled in
with error codes in cases where that's appropriate. But by the time
you get to submitting a write transaction to the disk or backend, it's
going to pass; the only failure mode that can happen at that point
(an actual EIO or other error from disk) is one that results in the
OSD suiciding.
-Greg

On Sat, Nov 10, 2018 at 7:40 AM cui xiao fei Cui <thinkercui@xxxxxxxxx> wrote:
>
> Hi all,
> We are very confused by two problems with rmw.
>
> 1. The return code of read is not handled.
>
> The callback of read is like this.
> ECBackend::try_state_to_reads
> ...
>   objects_read_async_no_cache(
>     op->remote_read,
>     [this, op](map<hobject_t,pair<int, extent_map> > &&results) {
>       for (auto &&i: results) {
>         op->remote_read_result.emplace(i.first, i.second.second);
>       }
>       check_ops();
>     });
>
> The read return code in pair<int, extent_map> is never considered.
> Shall we detect the return code and cancel the write op immediately if
> there are read errors, to prevent a later assert?
>
> 2. The return code of write is always 0.
> The write op reply is like this.
>
> PrimaryLogPG::execute_ctx
> ...
>   ctx->reply = new MOSDOpReply(m, 0, get_osdmap()->get_epoch(), 0,
>                                successful_write);
> ...
>   ctx->register_on_commit(
>     [m, ctx, this](){
>       ...
>       if (m && !ctx->sent_reply) {
>         MOSDOpReply *reply = ctx->reply;
>         if (reply)
>           ctx->reply = nullptr;
>         else {
>           reply = new MOSDOpReply(m, 0,
>                                   get_osdmap()->get_epoch(), 0, true);
>           reply->set_reply_versions(ctx->at_version,
>                                     ctx->user_at_version);
>         }
>         osd->send_message_osd_client(reply, m->get_connection());
>       }
>       ...
>     });
>
> We find that the return code of write will always be 0 if no error
> occurred at the prepare_transaction stage.
> Shall we return an error code to tell the client that something bad
> has happened?
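
For reference, the check proposed in point 1 could look roughly like the
standalone sketch below. It is not Ceph code: hobject_t and extent_map are
replaced by simple stand-in types so that it compiles on its own, and
collect_read_results() is a hypothetical helper name. It only shows the idea
of inspecting the int in pair<int, extent_map> before keeping the data.

```cpp
#include <cerrno>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Stand-in types (assumptions), so the sketch compiles without Ceph headers.
using hobject_t = std::string;
using extent_map = std::map<uint64_t, std::string>;
using read_results_t = std::map<hobject_t, std::pair<int, extent_map>>;

// Hypothetical helper: keep the extents of every successful read in `out`
// and return the first negative return code encountered (in key order),
// or 0 if all reads succeeded.
int collect_read_results(read_results_t &&results,
                         std::map<hobject_t, extent_map> &out) {
  int first_error = 0;
  for (auto &&i : results) {
    const int r = i.second.first;
    if (r < 0) {
      if (first_error == 0) {
        first_error = r;
      }
      continue;  // do not stash data from a failed read
    }
    out.emplace(i.first, std::move(i.second.second));
  }
  return first_error;
}
```

In a callback shaped like the quoted one, the caller would look at the
returned value before calling check_ops() and then decide what to do with
the op. What the right reaction is in the real ECBackend (and whether an
error can reach this path at all) is the open question in this thread.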