Re: set_alloc_hint old osds

On Fri, Sep 12, 2014 at 1:21 AM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
> Yeah, so that's part of it.  The larger question is whether it's ok
> for the client to indiscriminately send that op in the first place.

FWIW, I think it's got to be.  We don't control all the clients, and
I believe I mentioned this to Sage or Josh a while back.  We set FAILOK
to make older OSDs ignore the alloc hint op, but of course that
doesn't help if it's (one of) the replica OSDs that is older.  When
merging alloc hint, it was understood that if there are any older OSDs
in the acting set they will crash in FileStore, but nothing was done
about it.
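
To make the failure mode concrete, here is a toy Python model (not Ceph
code).  The names Op, primary_execute and replica_apply, the FAILOK bit
value, and the sets of known ops are all made up for illustration.  The
only behavior taken from the above is that FAILOK is honored where the
client op is executed, and that a replica has no such escape hatch when
it applies the resulting transaction.

```python
import errno

FAILOK = 0x2  # illustrative flag bit for this sketch only


class Op:
    def __init__(self, name, flags=0):
        self.name = name
        self.flags = flags


def primary_execute(ops, known_ops):
    """Toy primary: walk the client's compound request.

    Returns (result, op names that go into the replicated transaction).
    """
    accepted = []
    for op in ops:
        if op.name not in known_ops:
            if op.flags & FAILOK:
                continue  # failure tolerated: skip this op, keep going
            return -errno.EOPNOTSUPP, []
        accepted.append(op.name)
    return 0, accepted


def replica_apply(op_names, known_ops):
    """Toy replica: apply the transaction built by the primary.

    FAILOK is not consulted here, so an unknown op is fatal.
    """
    for name in op_names:
        if name not in known_ops:
            raise RuntimeError("unknown transaction op: " + name)
    return len(op_names)


request = [Op("set_alloc_hint", FAILOK), Op("write")]
OLD = {"write"}                      # OSD without alloc hint support
NEW = {"write", "set_alloc_hint"}    # OSD with alloc hint support
```

With an old primary, primary_execute(request, OLD) drops the hint and the
write goes through.  With a new primary, the hint ends up in the
transaction, and replica_apply(txn, OLD) raises, which stands in for the
FileStore crash.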

The full feature bit sounded like overkill, especially given that
alloc hint doesn't affect the data layout, older OSDs can still read
and write fine, and after all it's just a hint.  Having the primary
return -EOPNOTSUPP based on lists of supported ops sounds to me like
a good idea, both for alloc hint op and future ops.
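
Roughly what I have in mind, as a Python sketch rather than actual OSD
code.  It assumes each OSD in the acting set has already shared its list
of supported ops with the primary; the data structure and the names
common_ops and primary_check are hypothetical.

```python
import errno


def common_ops(supported_by_osd):
    """Ops supported by every OSD in the acting set.

    supported_by_osd maps an OSD id to the set of op names it reported.
    """
    sets = list(supported_by_osd.values())
    return set.intersection(*sets) if sets else set()


def primary_check(op_name, supported_by_osd):
    """Return 0 if the whole acting set can handle the op, else -EOPNOTSUPP."""
    if op_name in common_ops(supported_by_osd):
        return 0
    return -errno.EOPNOTSUPP


acting = {
    0: {"read", "write", "set_alloc_hint"},  # primary
    1: {"read", "write", "set_alloc_hint"},  # upgraded replica
    2: {"read", "write"},                    # older replica
}
```

Here primary_check("set_alloc_hint", acting) returns -EOPNOTSUPP because
of osd 2, while "write" is accepted.  Since the client sets FAILOK on the
hint, that error would be ignored and the rest of the request could
proceed.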

Thanks,

                Ilya


> -Sam
>
> On Thu, Sep 11, 2014 at 2:05 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> Oh, in that case the peers could just share their supported ops with
>> the primary or something (like we do with mon commands). That sounds
>> good to me, anyway?
>> -Greg
>>
>> On Thu, Sep 11, 2014 at 1:46 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
>>> No, we don't put the transaction into the pg log.
>>> -Sam
>>>
>>> On Thu, Sep 11, 2014 at 1:40 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>> Does the hint not go into the pg log? Which could be retried on an older OSD?
>>>>
>>>> On Thu, Sep 11, 2014 at 1:33 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
>>>>> That part is harmless: the transaction would be recreated for the new
>>>>> acting set, taking into account the new acting set features.  It
>>>>> doesn't have any actual effect on the contents of the object.
>>>>> -Sam
>>>>>
>>>>> On Thu, Sep 11, 2014 at 1:30 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>>>> On Thu, Sep 11, 2014 at 1:19 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
>>>>>>> http://tracker.ceph.com/issues/9419
>>>>>>>
>>>>>>> librbd unconditionally sends set_alloc_hint.  Do we require that users
>>>>>>> upgrade the osds first?  Also, should the primary respond with
>>>>>>> ENOTSUPP if any replicas don't support it?
>>>>>>
>>>>>> Something closer to the second option, I think...but then you run into
>>>>>> the problem where maybe the PG gets moved from a set of new OSDs to a
>>>>>> set of old ones that don't support the op. :/ I think for anything
>>>>>> that goes to disk you need to go through a full features-in-the-osdmap
>>>>>> process like we did for erasure coding.
>>>>>> -Greg
>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com