Re: a few rados blueprints

On Thu, Jul 25, 2013 at 4:28 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> On Thu, 25 Jul 2013, Gregory Farnum wrote:
>> On Thu, Jul 25, 2013 at 4:01 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
>> > I've added a blueprint for avoiding double-writes when using btrfs:
>> >
>> >         http://wiki.ceph.com/01Planning/02Blueprints/Emperor/osd:_clone_from_journal_on_btrfs
>> >
>> > This should improve throughput significantly when the journal is a file in
>> > btrfs.
>> >
>> > ---
>> >
>> > Also, there's one for improving the localized read behavior:
>> >
>> >         http://wiki.ceph.com/01Planning/02Blueprints/Emperor/librados%2F%2Fobjecter%3A_smarter_localized_reads
>> >
>> > For example, for read-only parents of rbd clones, we may as well read from
>> > the replica in the same host or rack or row--whatever crush can tell
>> > us--and not the primary.  This is good for locality and load distribution
>> > when certain object sets are hot.
>>
>> This blueprint includes work items to set locality information in
>> libcephfs and via the Hadoop bindings. However, there's still a read
>> hole issue with read-from-replicas [1] that makes this generally
>> unwise. Did you consider that when writing this blueprint?
>> In particular I think we want to discuss whether we allow people to use a
>> more powerful read-from-replica unless we can guarantee their usage of
>> it is safe (i.e., snapshots).
>
> Yeah, there's an open bug for that, but the solution doesn't seem
> interesting enough to warrant a CDS discussion...
>
>         http://tracker.ceph.com/issues/5388
>
> But if I'm wrong, by all means write one! :)

I didn't think we had a solution yet, since your last words there are
"the fix on the OSD is going to be a bit more involved". :p
That doesn't mean we shouldn't do this; I just think the read hole is a
problem the blueprint needs to address during design and
implementation: whether it's the user's problem to handle properly,
whether we want to lock it down to uses we can be reasonably sure are
safe, or whether we expect the local read issue to be resolved before
this work is completed.
-Greg