Re: rados semantic changes

On Wed, Mar 9, 2016 at 1:47 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Wed, 9 Mar 2016, Gregory Farnum wrote:
>> On Wed, Mar 9, 2016 at 12:42 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> > Resurrecting an old thread.
>> >
>> > I think we really want to make these semantic changes to current rados
>> > ops (like delete) to make life better going forward.  Ideally shortly
>> > after jewel so that they have plenty of time to bake before K and L.
>> >
>> > I'm wondering if the way to make this change visible to users is to
>> > (finally) rev librados to librados3.  We can take the opportunity to make
>> > any other pending cleanups to the public API as well...
>>
>> Yep. I presume you're thinking of this because of
>> http://tracker.ceph.com/issues/14468? It looks like we didn't really
>> have any good solutions for that pipelining problem though; any new
>> suggestions?
>
> Yeah, I'm still not very happy with either alternative:
>
> 1) We persistently record the reqid and return value in the pg log.  This
> turns failed rw ops into a replicated (metadata) write, which sort of
> sucks.  It also means that we probably *wouldn't* store any reply payload,
> which means we lose the ability to have a failure return useful data
> (e.g., info about why it failed).

This inability to return data on writes has pretty persistently sucked
for us... I wonder if we should be attacking it from that direction
instead. We just don't want pglog entries to get that large and are
worried about being able to reproduce the data on replay, right?
Perhaps we could add some kind of limited-size lookaside thing. Given
that RW ops *are* a write on success (whatever "success" means in the
op's context) I'm not so concerned about turning them into writes even
if they would have been a read. The other option is #2, which as you
note might have some serious performance implications on the client
side. :/
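
To make the "limited-size lookaside" idea concrete, here is a rough
sketch of a bounded table mapping reqid to (return code, small payload).
It is illustrative only and not Ceph code: the class name, the method
names, the size limits, and the oldest-first eviction are all
assumptions, and it ignores replication and replay entirely.

```python
from collections import OrderedDict


class ReplyLookaside:
    """Hypothetical bounded map of reqid -> (retval, payload)."""

    def __init__(self, max_entries=1024, max_payload=256):
        self.max_entries = max_entries   # assumed cap on tracked requests
        self.max_payload = max_payload   # assumed cap on stored reply bytes
        self._entries = OrderedDict()

    def record(self, reqid, retval, payload=b""):
        # Oversized payloads are dropped so entries stay small; the
        # return code is always kept.
        if len(payload) > self.max_payload:
            payload = b""
        self._entries[reqid] = (retval, payload)
        self._entries.move_to_end(reqid)
        # Evict the oldest entries once the table is over its limit.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def lookup(self, reqid):
        # Returns (retval, payload) for a resent request, or None if the
        # entry was never recorded or has been evicted.
        return self._entries.get(reqid)
```

A resent op whose reqid is still in the table could get the original
return code back; once the entry is evicted, the caller is back to the
current behavior.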
-Greg

>
> 2) The objecter prevents rw ops from being pipelined.  This means a hash
> table in the objecter so that it transparently blocks subsequent requests
> to the same object.  Or,
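
For reference, option 2 as quoted above (a hash table in the objecter
that holds back later requests to the same object) could look roughly
like the sketch below. This is not the actual Objecter code: the class,
the `send` callback, and the `on_reply` hook are hypothetical names
chosen for illustration, and every op is treated as an rw op.

```python
from collections import defaultdict, deque


class SerializingSubmitter:
    """Hypothetical client-side gate: at most one op in flight per object."""

    def __init__(self, send):
        self._send = send                    # callback that puts an op on the wire
        self._inflight = set()               # object ids with an outstanding op
        self._waiting = defaultdict(deque)   # object id -> ops held back, in order

    def submit(self, oid, op):
        if oid in self._inflight:
            # An earlier op on this object has not completed yet: queue.
            self._waiting[oid].append(op)
        else:
            self._inflight.add(oid)
            self._send(oid, op)

    def on_reply(self, oid):
        queue = self._waiting.get(oid)
        if queue:
            # Release the next queued op; the object stays marked in flight.
            self._send(oid, queue.popleft())
            if not queue:
                del self._waiting[oid]
        else:
            self._inflight.discard(oid)
```

The cost is visible in the structure: a second op on the same object
waits a full round trip before it is even sent, which is the
client-side performance concern raised in the reply.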