Hi ceph-users,

I have a three-node cluster (just for testing). All nodes run Ubuntu 12.04 LTS with kernel 3.2.0-29-generic, and I am using ceph version 0.56.2. Each host has one disk formatted with ext4 and mounted, and one OSD; I moved the journal to another device (by symlinking, roughly as sketched below).
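To be concrete, the journal move looked something like this per OSD; the osd id, paths, and journal device here are illustrative, with the data dirs in the default /var/lib/ceph locations:

# service ceph stop osd.0
# mv /var/lib/ceph/osd/ceph-0/journal /mnt/journal-disk/osd0.journal
# ln -s /mnt/journal-disk/osd0.journal /var/lib/ceph/osd/ceph-0/journal
# service ceph start osd.0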
I have followed the quick start guide to the best of my ability, and the cluster health is HEALTH_OK. On a client machine I create an rbd device, make a filesystem on it, and mount it:

# rbd create foo --size 4096
# modprobe rbd
# rbd map foo --pool rbd --name client.admin
# mkfs.ext4 -m0 /dev/rbd/rbd/foo
# mount /dev/rbd/rbd/foo /mnt/myrbd

So far so good. Now, to test various things, I run this in one window:

# ceph -w
2013-02-19 15:40:23.632048 mon.0 [INF] pgmap v604: 768 pgs: 768 active+clean; 197 MB data, 1016 MB used, 13524 MB / 15308 MB avail

(So my first question: why is 1016 MB already used? It's not the journal, right?)

In another window, I run:

# while [ 1 ]; do dd if=/dev/zero of=/mnt/myrbd/test bs=4k count=102400; sleep 1; done

which simply writes the same 400 MB file over and over.
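If raw numbers from the OSD side would help, I can also grab the usage of each OSD's backing filesystem with something like this on each host (the data-dir path assumes the default layout):

# df -h /var/lib/ceph/osd/ceph-*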
During this run I can see the cluster filling up, and soon I am out of space.
The space left shrinks and shrinks:

2013-02-19 15:44:14.803659 mon.0 [INF] pgmap v624: 768 pgs: 768 active+clean; 1700 MB data, 3556 MB used, 10984 MB / 15308 MB avail
...
2013-02-19 15:44:36.955256 mon.0 [INF] pgmap v637: 768 pgs: 768 active+clean; 3067 MB data, 6704 MB used, 7836 MB / 15308 MB avail
...
2013-02-19 15:45:34.821546 mon.0 [INF] pgmap v672: 768 pgs: 768 active+clean; 3849 MB data, 8345 MB used, 6195 MB / 15308 MB avail
...
2013-02-19 15:49:19.940062 mon.0 [INF] pgmap v807: 768 pgs: 768 active+clean; 3849 MB data, 8358 MB used, 6182 MB / 15308 MB avail

At this point my one 400 MB file (plus overhead) is using something like 8 GB out of 15 GB?
Is this expected? I don't think most systems run out of space just by rewriting the same file over and over. If the data is replicated twice, shouldn't I expect something like 1.2 GB of space used? (The used/data ratio in the pgmap lines does stay at roughly 2:1 throughout, so replication seems to account for used versus data; what I can't explain is why the data figure itself keeps growing far past 400 MB.) Is there something that cleans out old, rewritten data? Am I just not giving it enough time to clean up? If I let the loop run, the cluster eventually runs out of space, and the process can be accelerated considerably by mounting the rbd on more than one host simultaneously.

Any comments/education appreciated. If I've left out anything that would be useful, please let me know.
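If it would help with diagnosis, I can run something like the following while the loop is going and post the output; the "rb." prefix is my understanding of how format-1 rbd images name their backing objects, so treat that part as a guess:

# rados df
# rados -p rbd ls | grep -c '^rb\.'

(foo is 4096 MB, so with the default 4 MB objects I would expect at most ~1024 data objects for the image.) One theory I can't rule out: as far as I know the rbd driver in the 3.2 kernel has no discard/TRIM support, so blocks that ext4 frees when the file is rewritten are never released on the RADOS side, and the image just tends toward fully allocated; that would also fit the data figure levelling off near the 4096 MB image size.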
-J.