Re: Ordering subscription messages to MonClient vs. command responses

On Tue, Mar 27, 2018 at 5:43 PM, John Spray <jspray@xxxxxxxxxx> wrote:
> On Tue, Mar 27, 2018 at 10:26 AM, kefu chai <tchaikov@xxxxxxxxx> wrote:
>> On Tue, Mar 20, 2018 at 6:45 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>>> On Mon, 19 Mar 2018, Gregory Farnum wrote:
>>>> On Mon, Mar 19, 2018 at 7:33 AM, John Spray <jspray@xxxxxxxxxx> wrote:
>>>> > Hi all,
>>>> >
>>>> > I was looking at places in ceph-mgr where we send a command from a
>>>> > module, and then want to proceed with some logic that involves reading
>>>> > the osdmap (there is a local copy in the manager, maintained by
>>>> > Objecter).
>>>> >
>>>> > I had been thinking that we should include cluster map epochs in the
>>>> > MMonCommandAck messages so that the client can (optionally) wait for
>>>> > that latest OSDMap before it considers the command complete.
>>>> >
>>>> > Then I thought, maybe this isn't necessary at all, because the mons
>>>> > would be doing the check_subs() etc calls before they actually respond
>>>> > to commands, so clients would always get their updated maps before
>>>> > seeing a command response message.
>>>> >
>>>> > So: mon experts, what do you think?  Is it safe to assume that clients
>>>> > will get their subscription updates before a command completion (even
>>>> > in the case of commands being forwarded)?  Or do we maybe need a
>>>> > little bit more logic on the client side in the manager?
>>>>
>>>> I would expect the order of op replies versus subscription fulfillment
>>>> messages to be an implementation detail, even if we do currently spool
>>>> off new map subscription requests inline with committing them. (I
>>>> don’t know at all if that’s the case.)
>>>
>>> Currently all of the subs are satisfied by update_from_paxos(), which
>>> means they get fulfilled before any replies (which are waiting_for_commit
>>> completions). Having recently fixed one of the monitor services that wasn't
>>> doing this, in order to fix a subscription bug, I'm pretty confident this
>>> is the "right" place to do it given how the mon is currently structured.
>>
>> if the command *updates* the status of the monitor in the sense that it triggers a
>> proposal, i think it's safe to assume that the client which sends the command
>> will be updated with the latest osdmap. but if it just *queries* the cluster
>> status from the mon, and the behavior of the client depends on the osdmap, there
>> is a risk of racing. John, what specific ceph-mgr module or calling path were
>> you looking at?
>
> I was thinking specifically of updates.  This was in some code I'm
> working on that creates pools: need to make sure that after my "osd
> pool create" command is done, it'll be reflected in the local OSDMap.
>
> It seems like the consensus is that it would be worthwhile to include
> map epochs in the command response for commands that update things.
>

yeah, i feel the same.
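
for illustration only, a rough sketch of the client-side wait this would imply.
it is a toy model, not ceph code: LocalMapCache, on_map_update(),
wait_for_epoch() and complete_command() are all made-up names, and it assumes
the command ack carries a hypothetical epoch field (ack_epoch) that does not
exist in MMonCommandAck today.

```python
import threading


class LocalMapCache:
    """Toy stand-in for the mgr's local OSDMap copy (hypothetical, not a Ceph API)."""

    def __init__(self):
        self._epoch = 0
        self._cond = threading.Condition()

    @property
    def epoch(self):
        with self._cond:
            return self._epoch

    def on_map_update(self, new_epoch):
        # Would be called when a subscription update delivers a newer map.
        with self._cond:
            if new_epoch > self._epoch:
                self._epoch = new_epoch
                self._cond.notify_all()

    def wait_for_epoch(self, wanted, timeout=None):
        # Block until the local map is at least `wanted`; returns False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._epoch >= wanted, timeout)


def complete_command(ack_epoch, cache, timeout=5.0):
    """Treat a command as complete only once the local map has caught up.

    `ack_epoch` is the assumed epoch field in the command ack.
    """
    if not cache.wait_for_epoch(ack_epoch, timeout):
        raise TimeoutError("local map never reached epoch %d" % ack_epoch)
    return cache.epoch
```

if the map for that epoch has already arrived by the time the ack is seen
(which is what the current mon ordering gives us), the wait returns immediately.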

> John
>
>>
>>>
>>> I think we have two options: acknowledge and enshrine this as part of the
>>> mon protocol, as John suggests.  No code changes but some small risk of
>>> regretting this if the mon ever gets a complete rewrite.
>>>
>>> Or add epochs to the MonCommands so that clients can explicitly wait.
>>> There is almost precedent for this in that PaxosService messages (special
>>> purpose non-command messages) have a version in them and their replies
>>> generally include one as well.  It would take quite a bit of work to
>>> extend this to include commands, though, and even if we did there are some
>>> commands that span multiple services and thus have an ill-defined
>>> version/epoch to pin themselves to.  This would require a lot of work and
>>> at the end of the day would require extra code on the clients to be
>>> "correct"... code that would never actually be exercised because, in
>>> reality, the current mon implementation always returns the maps before the
>>> reply.
>>>
>>> I can't think of any reason why we'd opt for #2 given the opportunity
>>> cost.
>>>
>>> sage
>>
>>
>>
>> --
>> Regards
>> Kefu Chai



-- 
Regards
Kefu Chai