On Wed, May 14, 2014 at 12:12 PM, Shesha Sreenivasamurthy
<sheshas at gmail.com> wrote:
> Hi,
> I was experimenting with Ceph and found an interesting behavior (at
> least to me): the number of objects doubled when a new placement group
> was added.
>
> Experiment Set Up:
>
> 3 Nodes with one OSD per node
> Replication = 1
>
> ceph osd pool create $poolName 1;
> ceph osd pool set $poolName size 1;
>
> Set number of PG=30
>
> ceph osd pool set $poolName pg_num 30;
> ceph osd pool set $poolName pgp_num 30
>
> Start creating objects of 1000 Bytes for a period of 120 seconds using
> rados bench with 1 thread
>
> rados -p $poolName -b 1000 -t 1 bench 120 write &> bench.out
>
> While the creation is going on gather df statistics every second
>
> rados df -p $poolName &> df.out
>
> After 75 seconds add a new placement group
>
> ceph osd pool set $poolName pg_num 31;
> ceph osd pool set $poolName pgp_num 31;
>
> Plot the number of objects and data size from the above df command.
>
> I was wondering why the object count doubled when we add a new
> placement group.

It's an accounting artifact. When you split PGs, each of the "child" PGs
will report the parents' statistics until it does a scrub and knows how
much data is actually present. I believe this is changed in Firefly, so
it splits up the data proportionally between them.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
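
For anyone who wants to repeat the experiment quoted above, the steps could be collected into a small script along these lines. This is only a rough sketch: the function names, the `run` wrapper, and the `DRY_RUN` switch are hypothetical helpers, not part of Ceph. Appending each sample to df.out with `>>` (rather than overwriting it) and sampling for a further 45 seconds after the split are assumptions. The ceph and rados commands themselves are the ones from the original post.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Set pg_num and pgp_num of pool $1 to $2.
set_pg_count() {
    run ceph osd pool set "$1" pg_num "$2"
    run ceph osd pool set "$1" pgp_num "$2"
}

# Create pool $1 with a single replica, then raise it to $2 PGs.
setup_pool() {
    run ceph osd pool create "$1" 1
    run ceph osd pool set "$1" size 1
    set_pg_count "$1" "$2"
}

# Sample statistics for pool $1 roughly once per second, $2 times,
# appending each sample to df.out (assumption: the post overwrote it).
sample_df() {
    _i=0
    while [ "$_i" -lt "$2" ]; do
        run rados df -p "$1" >> df.out 2>&1
        run sleep 1
        _i=$((_i + 1))
    done
}

# Full experiment: 30 PGs, 120 s of 1000-byte writes with one thread,
# one extra PG added after about 75 s.
run_experiment() {
    setup_pool "$1" 30
    run rados -p "$1" -b 1000 -t 1 bench 120 write > bench.out 2>&1 &
    _bench_pid=$!
    sample_df "$1" 75
    set_pg_count "$1" 31
    sample_df "$1" 45
    wait "$_bench_pid"
}
```

Calling `run_experiment $poolName` would run the whole sequence against a live cluster. Setting `DRY_RUN=1` first only prints the commands, which is a cheap way to check the sequence. Each `rados df` call takes some time itself, so the split will land somewhat later than exactly 75 seconds in.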