Re: chaos monkeys

<also moved to ceph-devel>
On Tue, Oct 9, 2012 at 9:59 AM, Sam Lang <sam.lang@xxxxxxxxxxx> wrote:
> On 10/09/2012 11:46 AM, Gregory Farnum wrote:
>>
>> On Tue, Oct 9, 2012 at 9:43 AM, Sam Lang <sam.lang@xxxxxxxxxxx> wrote:
>>>
>>>
>>> Could we add some other chaos monkeys to the network/storage
>>> infrastructure
>>> besides ms_inject_socket_failures?  In particular, I would like to add
>>> ms_inject_delay_msg and ms_inject_reorder_msgs.  I think those could
>>> potentially help flush out some bugs (such as:
>>>
>>> https://github.com/ceph/ceph/commit/fa66eaa162542ac01752ada91a46051dde060831).
>>
>>
>> You're going to have to explain these more — ordered delivery over a
>> connection is one of the guarantees that the messaging layer provides,
>> so that doesn't sound like a configurable we're going to add.
>
>
> That's true, but there's no guarantee that the source will always send them
> in the same order.  The bug I linked above is a good example: the MDS was
> sending out two messages, one the open session reply and the other the async
> stale session message.  The bug is only expressed when the stale comes
> before the open session, which is possible in some cases.  The stale
> originates from a timer expiring, and the open session is sent after the
> journal commit, so the timing (and ordering) of those two messages can vary
> based on when the timer thread gets scheduled to execute, how long the
> journal commit takes, etc.
>
> Reordering messages at the destination would act to simulate all the
> asynchronous paths like this that exist in our code.

The sending messenger also maintains ordering invariants. The endpoint
(the MDS) might not dispatch them in the same order all the time, but
that's at a different semantic layer and is not something we can
simulate inside the messenger — it requires semantic knowledge of
which messages are okay to reorder. If we just did random reordering
like you're suggesting, absolutely everything would break.
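
To illustrate the distinction drawn in this thread, here is a minimal Python
sketch. It is not Ceph code: Connection, run_sender, and observed_orders are
hypothetical names, and the random delays merely stand in for the timer expiry
and the journal commit. The connection delivers strictly in send order, yet
the receiver still sees both orderings because the sender enqueues the two
messages in whichever order its own events happen to complete.

```python
import random
from collections import deque


class Connection:
    """Hypothetical stand-in for an ordered connection: strict FIFO delivery."""

    def __init__(self):
        self._queue = deque()

    def send(self, msg):
        # Messages leave in exactly the order they were handed to us.
        self._queue.append(msg)

    def deliver_all(self):
        out = list(self._queue)
        self._queue.clear()
        return out


def run_sender(conn, timer_delay, journal_delay):
    """Send each message when its triggering event completes.

    timer_delay models when the stale-session timer fires; journal_delay
    models how long the journal commit takes (both are assumptions made
    for this illustration).
    """
    events = [(timer_delay, "stale"), (journal_delay, "open_reply")]
    for _, msg in sorted(events):
        conn.send(msg)


def observed_orders(trials=1000, seed=0):
    """Collect the distinct arrival orders seen across many simulated runs."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(trials):
        conn = Connection()
        run_sender(conn, rng.random(), rng.random())
        seen.add(tuple(conn.deliver_all()))
    return seen


if __name__ == "__main__":
    for order in sorted(observed_orders()):
        print(" then ".join(order))
```

In this sketch the variation comes entirely from run_sender, so a test that
wants to exercise the stale-before-open case would perturb the sender's
timing rather than shuffle messages inside the connection.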