Re: Ann Arbor Team's Flexible I/O Proposals (Ceph Next)

> The complicating case here is the OSD status set.  Running this
> through a single Paxos limits the number of OSDs that can coexist in a
> cluster.  We ought to split the set of OSDs between multiple masters to

Do the multiple masters you mention here mean the logical OSDs running
in one OSD process, as described in # Interfaces #?

> distribute the load. Each 'Up' or 'Down' event is independent of
> others, so all we require is that events get propagated into the
> correct OSDs and primaries and followers act as they're supposed to.
> 
> Versioning is a bigger problem here. We might have all masters
> increment their version when one increments its version if that could
> be managed without inefficiency. We might send a compound version with
> `MOSDOp`s, but combining that with the compound version above might be
> unwieldy. (Feedback on this issue would be greatly appreciated.)

Cheers,
S

----- Original Message -----
> From: "Adam C. Emerson" <aemerson@xxxxxxxxxx>
> To: "The Sacred Order of the Squid Cybernetic" <ceph-devel@xxxxxxxxxxxxxxx>
> Sent: Friday, April 15, 2016 5:05:37 PM
> Subject: Ann Arbor Team's Flexible I/O Proposals (Ceph Next)
>
> Ceph Developers,
>
> We've put together a few of the main ideas from our previous work in a
> brief form that we hope people will be able to digest, consider, and
> debate. We'd also like to discuss them with you at Ceph Next this
> Tuesday.
>

>
> ## The OSD Set ##
>
> The complicating case here is the OSD status set.  Running this
> through a single Paxos limits the number of OSDs that can coexist in a
> cluster.  We ought to split the set of OSDs between multiple masters to
> distribute the load. Each 'Up' or 'Down' event is independent of
> others, so all we require is that events get propagated into the
> correct OSDs and primaries and followers act as they're supposed to.
>
> Versioning is a bigger problem here. We might have all masters
> increment their version when one increments its version if that could
> be managed without inefficiency. We might send a compound version with
> `MOSDOp`s, but combining that with the compound version above might be
> unwieldy. (Feedback on this issue would be greatly appreciated.)

When Tom Keiser and I considered the problem of distributing AFS3 data
for a single vnode across multiple data servers, iirc, we both arrived
at the notion of compound DataVersion (or "range dv") as the extension
of DataVersion to the partitioned object.

It feels like a similar structure naturally arises here, though I admit
I have not thought about this problem in a while.
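
To make the compound-version idea concrete, here is a minimal sketch in
Python. It assumes one integer epoch per master, keyed by a master id,
with each master bumping only its own component, much like a vector
clock. The `CompoundVersion` class and its `bump`, `merge`, and
`dominates` methods are hypothetical names for illustration only; they
are not part of Ceph, AFS3, or the proposal above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CompoundVersion:
    """Hypothetical compound version: one epoch per master, keyed by master id."""
    epochs: Dict[int, int] = field(default_factory=dict)

    def bump(self, master_id: int) -> None:
        # Only the master that handled an Up/Down event advances its component.
        self.epochs[master_id] = self.epochs.get(master_id, 0) + 1

    def merge(self, other: "CompoundVersion") -> "CompoundVersion":
        # Component-wise maximum: the newest state known to either side.
        keys = self.epochs.keys() | other.epochs.keys()
        return CompoundVersion(
            {k: max(self.epochs.get(k, 0), other.epochs.get(k, 0)) for k in keys}
        )

    def dominates(self, other: "CompoundVersion") -> bool:
        # True when self is at least as new as other for every master.
        return all(self.epochs.get(k, 0) >= v for k, v in other.epochs.items())


# Two masters advance independently; neither version dominates the other
# until the two are merged.
a = CompoundVersion()
a.bump(0)
a.bump(0)
b = CompoundVersion()
b.bump(1)
merged = a.merge(b)
```

Under these assumptions the versions form a partial order rather than a
single counter, which is the cost of letting masters increment
independently: a receiver has to compare per-master components instead
of one integer.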

Regards,

Matt

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-707-0660
fax.  734-769-8938
cel.  734-216-5309
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Email:
shinobu@xxxxxxxxx
GitHub:
shinobu-x
Blog:
Life with Distributed Computational System based on OpenSource