Re: Asking bug ceph 15113

Resending to the mailing list; the previous message was rejected because it was not plain text.

On Fri, Oct 21, 2016 at 5:38 PM, kefu chai <tchaikov@xxxxxxxxx> wrote:
>
>
> On Monday, October 10, 2016, agung Laksono <agung.smarts@xxxxxxxxx> wrote:
>>
>> Thank you for the answer.
>>
>>
>> Could you tell me a bit about the peer?
>>
>> Does "the peer" mean a client?
>
>
> it's very likely an osd.
>
>> when I run:
>> agung@ceph:~/project/infernalis/src$ ag --cpp ms_handle_reset
>>
>> the result shows that many places call ms_handle_reset,
>> but I am not sure which one triggers Monitor::ms_handle_reset.
>
>
> it must be somewhere in the messenger subsystem. you can set a breakpoint at
> ms_handle_reset() in gdb, connect to the monitor with the ceph cli, then kill
> it.
>
>>
>>
>>
>>
>> On Sat, Oct 8, 2016 at 10:52 AM, kefu chai <tchaikov@xxxxxxxxx> wrote:
>>>
>>> + ceph-devel
>>>
>>> On Thu, Sep 29, 2016 at 2:35 PM, agung Laksono <agung.smarts@xxxxxxxxx>
>>> wrote:
>>> >
>>> > I would like to ask you about ceph-15113. In the bug description,
>>> > the scenario to reproduce the bug is:
>>>
>>> this is not how the bug is reproduced, it's just my analysis of the
>>> root cause. IMO,
>>> it would be tricky to reproduce this race, if it is possible at all.
>>>
>>> >
>>> > so the session was not removed, which is why the request was handled
>>> > after the connection was reset. this is a race condition:
>>> >
>>> > ______________________ SafeTimer::timer_thread(), with mon_lock:
>>> > ______________________ elector: in win_election(), it calls
>>> > resend_routed_requests(), and collects the routed requests
>>> > msgr: in ms_handle_reset(), it resets the session
>>> > msgr: it waits for the lock
>>> > ______________________ elector: in win_election(), it calls
>>> > handle_command(), but the session is reset, hence it panics.
>>> > msgr: it removes the session, and erases the related requests from
>>> > Monitor::routed_requests.
>>> >
>>> >
>>> > I am trying to reproduce this bug on a cluster on my local machine and
>>> > am having difficulty triggering ms_handle_reset.
>>> >
>>> > Does ms_handle_reset refer to Monitor::ms_handle_reset(Connection
>>> > *con)?
>>>
>>> yes.
>>>
>>> > How can I trigger this function? In my study, I added mon logging for
>>> > when this method
>>>
>>> when the peer resets the connection.
>>>
>>> > is executed.
>>> > However, I saw that ms_handle_reset was called seemingly at random. I
>>> > mean, this function is also called
>>> > several times while the system runs.
>>> >
>>> >  Thank you in advance!
>>> >
>>> > --
>>> > Cheers,
>>> >
>>> > Agung Laksono
>>> >
>>>
>>>
>>>
>>> --
>>> Regards
>>> Kefu Chai
>>
>>
>>
>>
>> --
>> Cheers,
>>
>> Agung Laksono
>>
>
>
>
>
>
> --
> Regards
> Kefu Chai
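
For anyone reading this in the archive: the interleaving quoted above from the bug description can be replayed deterministically without a cluster. Below is a minimal sketch in plain C++, not Ceph code. Session, RoutedRequest, MiniMon, replay() and the "status" command are all hypothetical stand-ins I made up for illustration, and a counter stands in for the panic. It only shows why the order "collect, reset, handle, remove" leaves the elector holding a request whose session is already reset, while "reset, remove, collect, handle" does not.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Ceph's classes.
struct Session {
  bool reset = false;  // set once the peer has reset the connection
};

struct RoutedRequest {
  std::shared_ptr<Session> session;
  std::string cmd;
};

struct MiniMon {
  std::map<int, RoutedRequest> routed_requests;  // keyed by a request id
  int handled_on_reset_session = 0;              // stands in for the panic

  // elector, holding the lock: snapshot the routed requests for resending.
  std::vector<RoutedRequest> collect_routed_requests() const {
    std::vector<RoutedRequest> out;
    for (const auto& kv : routed_requests) out.push_back(kv.second);
    return out;
  }

  // msgr, before it owns the lock: only flag the session as reset.
  static void mark_reset(const std::shared_ptr<Session>& s) { s->reset = true; }

  // msgr, once it owns the lock: erase the requests tied to the session.
  void remove_session(const std::shared_ptr<Session>& s) {
    for (auto it = routed_requests.begin(); it != routed_requests.end();) {
      if (it->second.session == s) it = routed_requests.erase(it);
      else ++it;
    }
  }

  // elector, holding the lock: handle each collected request and count
  // the ones whose session has already been reset.
  void handle_commands(const std::vector<RoutedRequest>& reqs) {
    for (const auto& r : reqs)
      if (r.session->reset) ++handled_on_reset_session;
  }
};

// Replays one of two orderings and returns how many requests were handled
// against a session that had already been reset.
int replay(bool racy_order) {
  MiniMon mon;
  auto s = std::make_shared<Session>();
  mon.routed_requests[1] = RoutedRequest{s, "status"};

  if (racy_order) {
    auto reqs = mon.collect_routed_requests();  // elector collects first
    MiniMon::mark_reset(s);                     // msgr resets, then waits for the lock
    mon.handle_commands(reqs);                  // elector sees a reset session
    mon.remove_session(s);                      // msgr cleans up too late
  } else {
    MiniMon::mark_reset(s);                     // msgr resets ...
    mon.remove_session(s);                      // ... and cleans up before the election
    auto reqs = mon.collect_routed_requests();  // nothing left to collect
    mon.handle_commands(reqs);
  }
  return mon.handled_on_reset_session;
}
```

Calling replay(true) returns 1 (one request handled on a reset session) and replay(false) returns 0. The sketch is single-threaded on purpose: it fixes the ordering by hand instead of relying on thread timing, which is also why it says nothing about how often the real race fires.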



-- 
Regards
Kefu Chai