Re: ceph distributed osd

Robert LeBlanc <robert@xxxxxxxxxxxxx> · Wed, 19 Aug 2015 12:45:41 -0600

By default, all pools use all OSDs. Each RBD image, for instance, is
broken up into 4 MB objects, and those objects are distributed somewhat
uniformly across the OSDs. When you add another OSD, the CRUSH map
is recalculated and the OSDs shuffle objects to their new locations,
again distributing them somewhat uniformly across all available
OSDs.
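
To make the placement idea concrete, here is a toy Python sketch. It is
not how Ceph is implemented: it uses md5 instead of rjenkins, a plain
modulo for the PG, and rendezvous hashing as a stand-in for CRUSH. The
image id "abc123" and the helper names are made up for illustration.

```python
import hashlib
from collections import Counter

OBJECT_SIZE = 4 * 1024 * 1024  # 4 MB objects, as described above
PG_NUM = 126                   # pg_num of the 'repo' pool quoted below
REPLICAS = 2

def h(*parts):
    """Stable integer hash of the parts (md5 here, not Ceph's rjenkins)."""
    data = "/".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")

def object_names(image_bytes, prefix="rbd_data.abc123"):
    """Names of the objects backing an image; the image id is hypothetical."""
    count = -(-image_bytes // OBJECT_SIZE)  # ceiling division
    return [f"{prefix}.{i:016x}" for i in range(count)]

def pg_for(name, pg_num=PG_NUM):
    """Object name -> placement group (only the name matters, not the size)."""
    return h(name) % pg_num

def osds_for(pg, osds, replicas=REPLICAS):
    """Stand-in for CRUSH: keep the OSDs with the highest score for this PG."""
    return sorted(osds, key=lambda osd: h(pg, osd), reverse=True)[:replicas]

def place(names, osds):
    return {n: osds_for(pg_for(n), osds) for n in names}

names = object_names(1024 ** 3)          # a 1 GB image
before = place(names, [0, 1, 2])         # three OSDs
after = place(names, [0, 1, 2, 3])       # one OSD added

usage_before = Counter(o for osds in before.values() for o in osds)
usage_after = Counter(o for osds in after.values() for o in osds)
moved = sum(1 for n in names if set(before[n]) != set(after[n]))

print("objects:", len(names))
print("replicas per OSD before:", dict(sorted(usage_before.items())))
print("replicas per OSD after: ", dict(sorted(usage_after.items())))
print("objects whose OSD set changed:", moved)
```

In this model every pool-wide write lands on all OSDs, and adding an
OSD moves only the objects that now map to the new OSD; the rest stay
where they were.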

I say "somewhat" uniformly because placement is based on a hash of the
object name; object size is not taken into account, so some OSDs may
end up holding more large objects than others. The number of PGs
affects how uniformly the data can be distributed (more PGs means more
hash buckets for data to land in).
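
The effect of the PG count can be seen in a small simulation. Again
this is only a sketch under assumptions: md5 and rendezvous hashing
stand in for Ceph's real hashing and CRUSH, the cluster size (6 OSDs,
replica 2, 10000 objects) is arbitrary, and only object counts are
compared, not bytes.

```python
import hashlib
from collections import Counter

def h(*parts):
    """Stable integer hash of the parts (md5, standing in for rjenkins)."""
    data = "/".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")

def objects_per_osd(pg_num, num_osds=6, replicas=2, num_objects=10000):
    """Count object replicas per OSD for a given number of PGs."""
    osds = range(num_osds)
    # Map each PG to its OSDs once (rendezvous hashing, a stand-in for CRUSH).
    pg_to_osds = {
        pg: sorted(osds, key=lambda o: h("pg", pg, o), reverse=True)[:replicas]
        for pg in range(pg_num)
    }
    counts = Counter({o: 0 for o in osds})  # keep idle OSDs visible as zero
    for i in range(num_objects):
        for osd in pg_to_osds[h("obj", i) % pg_num]:
            counts[osd] += 1
    return counts

def imbalance(counts):
    """Busiest OSD relative to the mean; 1.0 would be perfectly even."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean

for pg_num in (1, 8, 64, 512, 4096):
    ratio = imbalance(objects_per_osd(pg_num))
    print(f"pg_num={pg_num:5d}  max/mean={ratio:.2f}")
```

With a single PG, all data sits on just two OSDs; as pg_num grows the
ratio should move toward 1.0, which is the "more hash buckets" effect.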

You can create CRUSH rules that limit selection of OSDs to a subset
and then configure a pool to use those rules. This is a pretty
advanced configuration option.
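
As a rough illustration of what such a rule achieves, the toy model
below binds each pool to a rule that names the OSDs it may select
from. This is not the Ceph API or the CRUSH map syntax; the rule names,
pool names, and OSD ids are hypothetical. On a real cluster the binding
is the pool's crush_ruleset, the field visible in the `ceph osd dump`
output quoted below.

```python
import hashlib

def h(*parts):
    """Stable integer hash of the parts (md5, standing in for rjenkins)."""
    data = "/".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")

# Hypothetical rules: each one lists the OSDs it is allowed to select.
RULES = {
    "all_osds": [0, 1, 2, 3, 4, 5],
    "subset": [4, 5],
}

# Hypothetical pools, each configured to use one rule.
POOLS = {"poolA": "all_osds", "poolB": "subset"}

def osds_for(pool, obj_name, replicas=2, pg_num=128):
    """Pick replica OSDs for an object, limited to the pool's rule."""
    candidates = RULES[POOLS[pool]]
    pg = h(pool, obj_name) % pg_num
    # Rendezvous hashing over the allowed OSDs only (a stand-in for CRUSH).
    ranked = sorted(candidates, key=lambda o: h(pool, pg, o), reverse=True)
    return ranked[:replicas]

for pool in POOLS:
    used = sorted({o for i in range(1000) for o in osds_for(pool, f"obj{i}")})
    print(pool, "uses OSDs", used)
```

Without such rules, both pools would draw from the same full set of
OSDs, which is the default behavior described at the top of this mail.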

I hope that helps with your question.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

On Tue, Aug 18, 2015 at 8:26 AM, gjprabu <gjprabu@xxxxxxxxxxxx> wrote:
> Hi Luis,
>
> What I mean is: we have three OSDs, each on a 1 TB hard disk, and two
> pools (poolA and poolB) with replica 2. The write behavior is what
> confuses us. Our assumptions are below.
>
> PoolA   -- may write with OSD1 and OSD2  (is this correct)
>
> PoolB  --  may write with OSD3 and OSD1 (is this correct)
>
> Suppose the hard disks become full: how many OSDs need to be added,
> and how will writes behave with the new OSDs?
>
> After adding a few OSDs:
>
> PoolA --  may write with OSD4 and OSD5 (is this correct)
> PoolB --  may write with OSD5 and OSD6 (is this correct)
>
>
> Regards
> Prabu
>
> ---- On Mon, 17 Aug 2015 19:41:53 +0530 Luis Periquito <periquito@xxxxxxxxx>
> wrote ----
>
> I don't understand your question. You created a 1G RBD/disk and it's full.
> You are able to grow it though - but that's a Linux management issue, not
> ceph.
>
> As everything is thin-provisioned you can create an RBD with an arbitrary
> size - I've created one with 1PB when the cluster only had 600G/Raw
> available.
>
> On Mon, Aug 17, 2015 at 1:18 PM, gjprabu <gjprabu@xxxxxxxxxxxx> wrote:
>
> Hi All,
>
> Can anybody help with this issue?
>
> Regards
> Prabu
>
> ---- On Mon, 17 Aug 2015 12:08:28 +0530 gjprabu <gjprabu@xxxxxxxxxxxx> wrote
> ----
>
> Hi All,
>
> Please also find the OSD information below.
>
> ceph osd dump | grep 'replicated size'
> pool 2 'repo' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 126 pgp_num 126 last_change 21573 flags hashpspool stripe_width 0
>
> Regards
> Prabu
>
>
>
>
> ---- On Mon, 17 Aug 2015 11:58:55 +0530 gjprabu <gjprabu@xxxxxxxxxxxx> wrote
> ----
>
>
>
> Hi All,
>
> We need to test three OSDs and one image (size 1 GB) with replica 2. While
> testing, data cannot be written beyond 1 GB. Is there any option to write
> to the third OSD?
>
> ceph osd pool get  repo  pg_num
> pg_num: 126
>
> # rbd showmapped
> id pool image          snap device
> 0  rbd  integdownloads -    /dev/rbd0 -- Already one
> 2  repo integrepotest  -    /dev/rbd2  -- newly created
>
>
> [root@hm2 repository]# df -Th
> Filesystem           Type      Size  Used Avail Use% Mounted on
> /dev/sda5            ext4      289G   18G  257G   7% /
> devtmpfs             devtmpfs  252G     0  252G   0% /dev
> tmpfs                tmpfs     252G     0  252G   0% /dev/shm
> tmpfs                tmpfs     252G  538M  252G   1% /run
> tmpfs                tmpfs     252G     0  252G   0% /sys/fs/cgroup
> /dev/sda2            ext4      488M  212M  241M  47% /boot
> /dev/sda4            ext4      1.9T   20G  1.8T   2% /var
> /dev/mapper/vg0-zoho ext4      8.6T  1.7T  6.5T  21% /zoho
> /dev/rbd0            ocfs2     977G  101G  877G  11% /zoho/build/downloads
> /dev/rbd2            ocfs2    1000M 1000M     0 100% /zoho/build/repository
>
> @:~$ scp -r sample.txt root@integ-hm2:/zoho/build/repository/
> root@integ-hm2's password:
> sample.txt                                100% 1024MB   4.5MB/s   03:48
> scp: /zoho/build/repository//sample.txt: No space left on device
>
> Regards
> Prabu
>
>
>
>
> ---- On Thu, 13 Aug 2015 19:42:11 +0530 gjprabu <gjprabu@xxxxxxxxxxxx> wrote
> ----
>
>
>
> Dear Team,
>
> We are using two Ceph OSDs with replica 2, and it is working
> properly. My question is this: Pool A has an image of size 10 GB,
> replicated across the two OSDs. What happens when the image reaches its
> size limit? Is there any way to have the data continue writing to another
> two OSDs?
>
> Regards
> Prabu
>
>
>
>
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com