On Fri, Feb 15, 2013 at 6:18 PM, Dimitri Maziuk <dmaziuk@xxxxxxxxxxxxx> wrote:
> On 02/15/2013 06:09 PM, Sam Lang wrote:
>
>> Your best bet is probably to add some location awareness to your job
>> scheduling so that a job runs on an osd where the file data is
>> located. You can access the location of a file with:
>>
>> cephfs <file> map
>>
>> If you want an entire 2GB file to end up on the same osd (sounds like
>> you do), you can set the layout of your files (or set the layout of a
>> parent directory and create files in it) with:
>>
>> cephfs <file> set_layout -s $[1024*1024*1024*2]
>>
>> Before doing that though, you might want to test out the performance
>> of a job on a ceph setup with only one osd (or all osds on the same
>> node). That will potentially tell you whether your network is a
>> significant bottleneck.
>
> Will do, thanks, however,
>
> I have 30-odd GB (and growing) of search data split into 2GB files, and
> each job reads through all of them. So what I do now with rsync, and want
> to replicate with ceph, is a full mirror of everything on every host
> (osd). Can I get ceph to do that?

You can, but ceph always performs reads from the primary osd, so you will
still need to use the set_layout and map commands mentioned above to run
your jobs on the right nodes.

> (I was trying to get there with pool size = # osds, min_size = # osds,
> and a crush map with the uniform algorithm & an equal weight of 1 for
> each host.)

pool size sets the number of replicas desired, and should be thought of as
the number of replicas that will exist in a clean, steady state (without
constant failures causing degraded pgs). min_size sets the number of
replicas required for I/O to succeed.

-sam

> Thanks again
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
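For anyone following along, here is a minimal sketch of the commands
discussed above, against a hypothetical 4-osd cluster. The pool name
"data" (the default cephfs data pool at the time), the /mnt/ceph/searchdb
path, and the part-00 file name are placeholders, not taken from the
thread, and the set_layout invocation simply mirrors Sam's example:

    # keep a copy of every object on each of the 4 osds, and
    # refuse I/O unless at least 2 copies are available
    ceph osd pool set data size 4
    ceph osd pool set data min_size 2

    # use 2GB objects for new files in this directory, so each
    # 2GB search file lands on a single osd
    cephfs /mnt/ceph/searchdb set_layout -s $[1024*1024*1024*2]

    # show which osds hold a given file, for location-aware scheduling
    cephfs /mnt/ceph/searchdb/part-00 map

Note that even with size equal to the number of osds, reads still go to
the primary osd for each object, so the map output is what a job
scheduler would key on to run jobs on the right nodes.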