Re: unfound object problem

http://tracker.ceph.com/issues/16185

So, it does seem to be a bug, and I've got a fix.  However, it's not
clear to me that it would result in an unfound object.  It seems like
it would result in a pg which should be down being allowed to peer or
an actually unfound object being erroneously considered ok.  I'm not
sure you've diagnosed the original issue correctly.  If you can
reproduce, you should enable logging (debug osd = 20, debug filestore
= 20, debug ms = 1) and go through the logs more carefully.
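
For reference, a minimal sketch of how those settings might look in
ceph.conf. Placing them under [osd] is an assumption, since the section
is not specified above:

```
[osd]
    # verbose OSD and filestore logging, plus basic messenger logging
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```
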
-Sam

On Tue, Jun 7, 2016 at 7:52 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> That does sound like a bug, I'll try to take a look today.
> -Sam
>
> On Mon, Jun 6, 2016 at 9:21 PM, Rui Xie <jerry.xr86@xxxxxxxxx> wrote:
>> Hi Sam
>>
>> We do not check the pgtemp map_epoch in preprocess_pgtemp and
>> prepare_pgtemp, so old pgtemp messages with a smaller map_epoch are
>> prepared and update up_thru to a smaller value.
>>
>> There are also a lot of duplicate pgtemp messages.
>>
>> 2016-06-07 6:07 GMT+08:00 Samuel Just <sjust@xxxxxxxxxx>:
>>> I don't quite understand...  Can you explain the sequence of events in
>>> more detail?
>>> -Sam
>>>
>>> On Mon, Jun 6, 2016 at 2:14 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>>> On Sun, May 29, 2016 at 8:58 PM, Rui Xie <jerry.xr86@xxxxxxxxx> wrote:
>>>>> Hi
>>>>>
>>>>> I found an unfound object problem in my test environment (hammer).
>>>>> I suspect the reason is the wrong update of up_thru.
>>>>>
>>>>> In the osdmap, up_thru becomes smaller than before at some epoch. Some
>>>>> old PGTemp messages with a smaller epoch are prepared and executed, and
>>>>> they change up_thru to that smaller epoch. As a result,
>>>>> maybe_went_rw is wrong for that interval.
>>>>>
>>>>> I think prepare_pgtemp should not change up_thru if the new value is
>>>>> smaller than the current one, and duplicate PGTemp messages should not
>>>>> be sent?
>>>>>
>>>>> Is this a bug, or am I doing something wrong?
>>>>>
>>>>> Thanks !
>>>>
>>>> I'm not sure if this came out of the same place or not, but Sam was
>>>> just talking last week about an issue with pgtemp updates that is at
>>>> least close to this bug. That one was resolved in the OSDMonitor, if
>>>> those are the up_thru locations you're talking about. :)
>>>> -Greg