Re: wip-crush

Gregory Farnum <greg@xxxxxxxxxxx> · Wed, 22 Aug 2012 10:04:22 -0700

On Wed, Aug 22, 2012 at 9:33 AM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> On Wed, 22 Aug 2012, Atchley, Scott wrote:
>> On Aug 22, 2012, at 10:46 AM, Florian Haas wrote:
>>
>> > On 08/22/2012 03:10 AM, Sage Weil wrote:
>> >> I pushed a branch that changes some of the crush terminology.  Instead of
>> >> having a crush type called "pool" that requires you to say things like
>> >> "pool=default" in the "ceph osd crush set ..." command, it uses "root"
>> >> instead.  That hopefully reinforces that it is a tree/hierarchy.
>> >>
>> >> There is also a patch that changes "bucket" to "node" throughout, since
>> >> bucket is a term also used by radosgw.
>> >>
>> >> Thoughts?  I think the main pain in making this transition is that old
>> >> clusters have maps that have a type 'pool' and new ones won't, and the
>> >> docs will need to walk people through both...
>> >
>> > "pool" in a crushmap being completely unrelated to a RADOS pool is
>> > something that I've heard customers/users report as confusing, as well.
>> > So changing that is probably a good thing. Naming it "root" is probably
>> > a good choice as well, as it happens to match
>> > http://ceph.com/wiki/Custom_data_placement_with_CRUSH.
>> >
>> > As for changing "bucket" to node... a "node" is normally simply a
>> > physical server (at least in HA terminology, which many potential Ceph
>> > users will be familiar with), and CRUSH uses "host" for that. So that's
>> > another recipe for confusion. How about using something super-generic,
>> > like "element" or "item"?
>> >
>> > Cheers,
>> > Florian
>>
>> My guess is that he is trying to use data structure tree nomenclature
>> (root, node, leaf). I agree that node is an overloaded term (as is
>> pool).
>
> Yeah...
>
>> As for an alternative to bucket which indicates the item is a
>> collection, what about subtree or branch?
>
> I think fixing the overloading of 'pool' in the default crush map is the
> biggest pain point.  I can live with crush 'buckets' staying the same (esp
> since that's what the papers and code use pervasively) if we can't come up
> with a better option.

I'm definitely most interested in replacing "pool", and "root" works
for that in my mind. RGW buckets live at a sufficiently different
level that I think people are unlikely to be confused, and "bucket"
is actually a good name for what the CRUSH ones are (I'm open to
better ones, but I don't think "node" qualifies).

> On the pool part, though, the challenge is how to transition.  Existing
> clusters have maps that use 'pool', and new clusters will use 'root' (or
> whatever).  Some options:
>
>  - document both.  this kills much of the benefit of switching, but is
>    probably inevitable since people will be running different versions.
>  - make the upgrade process transparently rename the type.  this lets
>    all the tools use the new names.
>  - make the tools silently translate old names to new names.  this is
>    kludgey in that it makes the code make assumptions about the names of
>    the data it is working with, but would cover everyone except those who
>    created their own crush maps from scratch.
>  - ?

I would go with option two (have the upgrade process rename the type)
and only document the new names. I wouldn't be surprised if the number
of people who have changed the default type names is zero. Anybody who
has done so can certainly be counted on to pay enough attention that a
one-line release note, "changed CRUSH names (see here if you customized
your map)", would be sufficient, right?
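
To make option two concrete, here is a rough sketch of what the rename
amounts to on a decompiled map. It is not the actual upgrade code; it
assumes the text format that "crushtool -d" produces (a "type N name"
list, bucket blocks introduced by their type name, and "step choose...
type name" lines in rules), and rename_crush_type is a made-up helper
name. It only touches those three positions, so a bucket or rule that
merely happens to be named "pool" is left alone.

```python
import re


def rename_crush_type(map_text, old="pool", new="root"):
    """Rename a CRUSH type in decompiled crush map text.

    Hypothetical helper: assumes the layout emitted by `crushtool -d`.
    Only type declarations, bucket headers, and choose steps change.
    """
    old_re = re.escape(old)
    patterns = [
        # Type declaration, e.g. "type 6 pool"
        re.compile(rf"^(\s*type\s+\d+\s+){old_re}(\s*)$"),
        # Bucket header, e.g. "pool default {"
        re.compile(rf"^(\s*){old_re}(\s+\S+\s*\{{\s*)$"),
        # Rule step, e.g. "step chooseleaf firstn 0 type pool"
        re.compile(
            rf"^(\s*step\s+choose(?:leaf)?\s+\S+\s+-?\d+\s+type\s+){old_re}(\s*)$"
        ),
    ]
    out = []
    for line in map_text.splitlines():
        for pat in patterns:
            m = pat.match(line)
            if m:
                # Keep everything around the type name exactly as it was.
                line = m.group(1) + new + m.group(2)
                break
        out.append(line)
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    if map_text.endswith("\n"):
        result += "\n"
    return result
```

Run over the output of "crushtool -d" and fed back through
"crushtool -c", something like this would cover maps that still use the
default names; anyone with a hand-built map would need to check the
result by eye.
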
-Greg