On Fri, Apr 6, 2012 at 12:45, Bernard Grymonpon <bernard@xxxxxxxxxxxx> wrote:
> Let's go wild and say you have hundreds of machines, summing up to
> thousands of disks, all already migrated/moved to other machines/...,
> and it reports that OSD 536 is offline. How will you find which disk
> is failing/corrupt/... in which machine? Will you keep track of which
> OSD ran on which node last?

That's a good question, and I don't have a good enough answer for you yet. Rest assured that's a valid concern.

It seems we're still approaching this from different angles. You want to have an inventory of disks, known by uuid, and you want to track where they are and plan their moves. I want to know I have N servers with K hdd slots each, and I want each one to be fully populated with healthy disks. I don't care which disk is where, and I don't think it's realistic for me to maintain a manual inventory.

A failed disk means: unplug that disk. An empty slot means: plug in a disk from the dedicated pile of spares. A chassis needing maintenance is shut down, its disks unplugged and plugged in elsewhere; I don't care where. A lost disk needs to have its osd deleted at some point (or just let them pile up; not a realistic problem for a decade or so).

Any inventory of disks is only realistic from the discovery angle: just report what's plugged in right now. I consider individual disks just about as uninteresting as power supplies.

Does that make sense? Details pending..
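A minimal sketch of the "discovery angle" described above, assuming a Linux host where /dev/disk/by-uuid holds one symlink per filesystem uuid: rather than maintaining a manual inventory, just enumerate what's plugged in right now. The function name and the parameterized directory are illustrative, not part of any Ceph tooling.

```python
import os

def discover_disks(by_uuid_dir="/dev/disk/by-uuid"):
    """Return {uuid: resolved device path} for whatever is plugged in now.

    No persistent state: the result reflects only the current moment,
    matching the "just report what's plugged in" approach.
    """
    inventory = {}
    if not os.path.isdir(by_uuid_dir):
        return inventory  # nothing discoverable (no disks, or not Linux)
    for uuid in os.listdir(by_uuid_dir):
        link = os.path.join(by_uuid_dir, uuid)
        # Resolve the symlink to the actual block device, e.g. /dev/sdb1.
        inventory[uuid] = os.path.realpath(link)
    return inventory

if __name__ == "__main__":
    for uuid, dev in sorted(discover_disks().items()):
        print(uuid, "->", dev)
```

Run periodically (or from a udev hook), this gives the "what is here now" view without anyone ever recording which disk moved where.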