Re: CephFS - Problems with the reported used space

On Fri, Aug 7, 2015 at 3:41 PM, Goncalo Borges
<goncalo@xxxxxxxxxxxxxxxxxxx> wrote:
> Hi All...
>
> I am still fighting with this issue. It may be something which is not
> properly implemented, and if that is the case, that is fine.
>
> I am still trying to understand the real space occupied by files in a
> CephFS filesystem, as reported for example by df.
>
> Maybe I did not explain myself clearly. I am not saying that block size has
> something to do with rbytes, I was just making a comparison with what I
> expect in a regular POSIX filesystem. Let me put the question in the
> following / different way:
>
> 1) I know that, if I only have one char file in a ext4 filesystem, where my
> filesystem was set with a 4KB blocksize, a df would show 4KB as used space.
>
> 2) now imagine that I only have one char file in my Cephfs filesystem, and
> the layout of my file is object_size=512K, stripe_count=2, and
> stripe_unit=256K. Also assume that I have set my cluster to have 3
> replicates. What would be the used space reported by a df command in this
> case?
>
> My naive assumption would be that a df should show as used space 512KB x 3.
> Is this correct?
>
> No. Used space reported by df is the sum of the used space of the OSDs'
> local stores. A 512k file requires 3x512k of space for data, and the OSD
> and its local filesystem also need extra space for tracking that data.
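To make the layout question above concrete, here is a minimal sketch of the standard CephFS striping arithmetic (function and variable names are my own, not taken from the Ceph source). It maps a file byte offset to the RADOS object it lands in, which shows why a one-character file only ever touches object 0:

```python
def file_offset_to_object(offset, object_size, stripe_unit, stripe_count):
    """Map a file byte offset to (object index, offset within that object)
    under CephFS file striping."""
    su_per_object = object_size // stripe_unit   # stripe units per object
    stripeno = offset // stripe_unit             # which stripe unit of the file
    stripepos = stripeno % stripe_count          # which object within the stripe set
    objectset = stripeno // (stripe_count * su_per_object)
    objectno = objectset * stripe_count + stripepos
    block = (stripeno // stripe_count) % su_per_object
    return objectno, block * stripe_unit + offset % stripe_unit

# Layout from the question: object_size=512K, stripe_count=2, stripe_unit=256K.
# A one-character file touches only object 0 at offset 0, so with 3 replicas
# the data itself needs roughly 3 x 1 byte of raw space (plus per-object
# overhead in the OSD's local filesystem), not 3 x 512K: RADOS objects are
# not padded out to object_size.
print(file_offset_to_object(0, 512 * 1024, 256 * 1024, 2))          # byte 0
print(file_offset_to_object(256 * 1024, 512 * 1024, 256 * 1024, 2)) # second stripe unit
```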
>
>
> Please bear with me on my simple-minded tests:
>
> 0) # umount /cephfs;mount -t ceph X.X.X.X:6789:/  /cephfs -o
> name=admin,secretfile=/etc/ceph/admin.secret
>
> 1) # getfattr -d -m ceph.* /cephfs/objectsize4M_stripeunit512K_stripecount8/
> (...)
> ceph.dir.layout="stripe_unit=524288 stripe_count=8 object_size=4194304
> pool=cephfs_dt"
> ceph.dir.rbytes="549755813888"
> (...)
>
>
> 2) # df -B 1 /cephfs/
> Filesystem                1B-blocks           Used      Available Use%
> Mounted on
> X.X.X.X:6789:/ 95618814967808 11738728628224 83880086339584  13% /cephfs
>
>
> 3) # dd if=/dev/zero
> of=/cephfs/objectsize4M_stripeunit512K_stripecount8/4096bytes.txt bs=1
> count=4096
> 4096+0 records in
> 4096+0 records out
> 4096 bytes (4.1 kB) copied, 0.0139456 s, 294 kB/s
>
>
> 4) # ls -lb /cephfs/objectsize4M_stripeunit512K_stripecount8/4096bytes.txt
> -rw-r--r-- 1 root root 4096 Aug  7 07:16
> /cephfs/objectsize4M_stripeunit512K_stripecount8/4096bytes.txt
>
>
> 5) # umount /cephfs;mount -t ceph X.X.X.X:6789:/  /cephfs -o
> name=admin,secretfile=/etc/ceph/admin.secret
>
>
> 6) # getfattr -d -m ceph.* /cephfs/objectsize4M_stripeunit512K_stripecount8/
> (...)
> ceph.dir.layout="stripe_unit=524288 stripe_count=8 object_size=4194304
> pool=cephfs_dt"
> ceph.dir.rbytes="549755817984"
>
>
> 7) # df -B 1 /cephfs/
> Filesystem                1B-blocks           Used      Available Use%
> Mounted on
> 192.231.127.8:6789:/ 95618814967808 11738728628224 83880086339584  13%
> /cephfs
>
>
> Please note that in these simple-minded tests:
>
>     a./  rbytes properly reports the change in size (after an umount/mount)
>             549755817984 - 549755813888 = 4096
>
>     b./ A df does not show any change.
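The rbytes arithmetic in a./ can be checked mechanically. A small sketch that pulls ceph.dir.rbytes out of getfattr-style output and confirms the recursive byte count grew by exactly the 4096-byte dd file (the attribute strings are copied from the test above; the parsing helper is my own):

```python
import re

def parse_rbytes(getfattr_output):
    """Extract the ceph.dir.rbytes value from getfattr -d -m ceph.* output."""
    m = re.search(r'ceph\.dir\.rbytes="(\d+)"', getfattr_output)
    return int(m.group(1))

before = parse_rbytes('ceph.dir.rbytes="549755813888"')
after = parse_rbytes('ceph.dir.rbytes="549755817984"')
print(after - before)  # 4096, the size of the dd-created file
```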

I think df on CephFS and 'ceph df' use the same mechanism to get used
and available space. The used space they report is not updated in real
time: OSDs report their used space to the monitor periodically, and the
monitor aggregates these reports to compute the used space of the whole
cluster.

>
> I could use 'ceph df detail' but it does not give me the granularity I want.
> Moreover, I also do not understand its output well:
>
> # ceph df
> GLOBAL:
>     SIZE       AVAIL      RAW USED     %RAW USED
>     89051G     78119G       10932G         12.28
> POOLS:
>     NAME          ID     USED      %USED     MAX AVAIL     OBJECTS
>     cephfs_dt     5      3633G      4.08        25128G     1554050
>     cephfs_mt     6      3455k         0        25128G          39
>
> - What determines MAX AVAIL? I am assuming it is ~ GLOBAL AVAIL /
> number of replicas...
> - The %USED is computed in reference to what? I am asking because it seems
> to be computed in reference to the GLOBAL SIZE... But this is misleading,
> since the POOL MAX AVAIL is much less.

Both used and available space are computed in reference to the global
size. Yes, it's misleading; we will improve it later.
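For what it's worth, the numbers in the 'ceph df' output quoted above are consistent with that: %USED for cephfs_dt is the pool's used bytes over the global raw size, and MAX AVAIL is in the ballpark of the global available space divided by the replica count (it will not match exactly, since CRUSH weighting and overhead also factor in). A quick sanity check, assuming 3x replication:

```python
# Figures from the 'ceph df' output quoted above (all in GiB).
global_size = 89051
global_avail = 78119
pool_used = 3633          # cephfs_dt USED
replicas = 3              # assumed replication factor

pct_used = round(pool_used / global_size * 100, 2)
print(pct_used)                        # 4.08, matching the reported %USED
print(round(global_avail / replicas))  # ~26040, roughly the reported MAX AVAIL of 25128
```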


>
> Thanks for the clarifications
> Goncalo
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


