On Fri, Jan 16, 2015 at 2:52 AM, Roland Giesler <roland@xxxxxxxxxxxxxx> wrote:
> On 14 January 2015 at 21:46, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> On Tue, Jan 13, 2015 at 1:03 PM, Roland Giesler <roland@xxxxxxxxxxxxxx>
>> wrote:
>> > I have a 4-node Ceph cluster, but the disks are not equally distributed
>> > across all machines (they are substantially different from each other).
>> >
>> > One machine has 12 x 1TB SAS drives (h1), another has 8 x 300GB SAS (s3),
>> > and two machines have only two 1TB drives each (s2 & s1).
>> >
>> > Now machine s3 has by far the most CPUs and RAM, so I'm running my VMs
>> > mostly from there, but I want to make sure that the writes that happen
>> > to the Ceph cluster get written to the "local" OSDs on s3 first, and then
>> > the additional writes/copies go out over the network.
>> >
>> > Is this possible with Ceph? The VMs are KVM in Proxmox, in case it's
>> > relevant.
>>
>> In general you can't set up Ceph to write to the local node first. In
>> some specific cases you can if you're willing to do a lot more work
>> around placement, and this *might* be one of those cases.
>>
>> To do this, you'd need to change the CRUSH rules pretty extensively,
>> so that instead of selecting OSDs at random, they have two steps:
>> 1) Starting from bucket s3, select a random OSD and put it at the
>> front of the OSD list for the PG.
>> 2) Starting from a bucket which contains all the other OSDs, select
>> N-1 more at random (where N is the number of desired replicas).
>
>
> I understand in principle what you're saying. Let me go back a step and
> ask the question somewhat differently then:
>
> I have set up 4 machines in a cluster. When I created the Windows 2008
> Server VM on S1 (I corrected my first email: I have three Sunfire X series
> servers, S1, S2, S3), since S1 has 36GB of RAM and 8 x 300GB SAS drives,
> it was running normally, pretty close to what I had on the bare metal.
> About a month later (after being on leave for 2 weeks), I found a machine
> that is crawling at a snail's pace, and I cannot figure out why.

You mean one of the VMs has very slow disk access? Or one of the hosts
is very slow? In any case, you'd need to look at what about that system
is different from the others and poke at that difference until it
exposes an issue, I suppose.
-Greg
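
For reference, a minimal sketch of what Greg's two-step description above
could look like as a CRUSH rule, modelled on the multi-take "ssd-primary"
style of example in the CRUSH map documentation. The bucket name "s3" is the
host from this thread; "others" is a hypothetical bucket you would have to
add to the CRUSH map yourself to hold the remaining hosts (CRUSH has no
built-in "everything except s3" bucket). This is an untested sketch, and
min_size/max_size are placeholders:

    # Pick the primary OSD from host s3, then spread the remaining
    # replicas across the other hosts.
    rule s3-primary {
        ruleset 1
        type replicated
        min_size 1
        max_size 10

        # Step 1: descend into the s3 host bucket and pick one OSD;
        # it becomes the first (primary) entry in the PG's OSD list.
        step take s3
        step choose firstn 1 type osd
        step emit

        # Step 2: from the (hypothetical) bucket holding the other
        # hosts, pick the remaining replicas on distinct hosts;
        # "firstn -1" means "pool size minus one".
        step take others
        step chooseleaf firstn -1 type host
        step emit
    }

If something along these lines were used, the edited map would be compiled
back in and the pool pointed at the new ruleset, roughly:

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt: add the "others" bucket and the rule above
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
    ceph osd pool set <pool> crush_ruleset 1

Two caveats on the design: pinning every primary to one host concentrates
all client I/O on that host, and the rule only changes placement, not write
ordering, since replicated writes are still acknowledged only after all
replicas have them.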