Re: ceph df: Raw used vs. used vs. actual bytes in cephfs


 



Another space "leak" might be due to BlueStore's behavior of counting DB partition space toward the total store size. All of that space is immediately marked as used, even for an empty store. So if you have 3 OSDs with a 10 GB DB device each, you unconditionally get 30 GB of used space in the report.

Plus an additional 1 GB (with default settings) per OSD, as BlueStore unconditionally locks that space on the block device for BlueFS usage.

It might also allocate (and hence report as used) even more space on the block device for BlueFS if the DB partition isn't big enough. You can inspect the OSD performance counters under the "bluefs" section to check that amount.
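As a back-of-the-envelope sketch of that baseline accounting (using only the example figures above: 10 GB DB partitions and the ~1 GB per-OSD BlueFS reservation; the helper name is mine, not a Ceph API):

```python
# Sketch: the baseline "RAW USED" an empty BlueStore cluster would report,
# per the behavior described above. Figures are the example values only.
GiB = 1024 ** 3

def baseline_raw_used(num_osds, db_partition_bytes, bluefs_reserve_bytes=1 * GiB):
    """DB partition space is counted as used immediately, plus the
    space BlueStore locks on the block device for BlueFS."""
    return num_osds * (db_partition_bytes + bluefs_reserve_bytes)

# 3 OSDs with 10 GB DB devices each -> ~30 GB shows as used right away,
# plus ~1 GB per OSD for the BlueFS reservation.
print(baseline_raw_used(3, 10 * GiB) / GiB)  # 33.0
```

So roughly 33 GB of "used" space before a single object is written, on a cluster shaped like that example.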


On 2/20/2018 10:33 AM, Flemming Frandsen wrote:
I didn't know about ceph df detail, that's quite useful, thanks.

I was thinking that the problem had to do with some sort of internal fragmentation, because the filesystem in question does have millions of files (2.9 M or thereabouts). However, even if 4k is lost for each file, that only amounts to about 23 GB of raw space lost, and I have 3276 GB of raw space unaccounted for.

I've researched the min alloc option a bit, and even though no documentation seems to exist, I've found that the default is 64k for hdd. But even if the lost space per file is 64k and that's mirrored, I can only account for 371 GB, so that doesn't really help a great deal.
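For reference, the per-file overhead arithmetic above works out roughly like this (file count and replication factor taken from the thread; this is a worst-case estimate, and the helper name is mine):

```python
# Rough worst-case space lost to allocation-unit rounding:
# up to one partially-filled allocation unit per file, times replication.
def max_alloc_overhead(num_files, min_alloc_size, replicas):
    return num_files * min_alloc_size * replicas

GB = 1000 ** 3
files = 2_900_000          # "2.9 M or thereabouts"

# 4k allocation, 2x replication: ~24 GB -- nowhere near the missing 3276 GB
print(max_alloc_overhead(files, 4096, 2) / GB)
# 64k allocation, 2x replication: ~380 GB -- still doesn't cover it
print(max_alloc_overhead(files, 65536, 2) / GB)
```

Even the 64k worst case is in the same ballpark as the 371 GB mentioned above, an order of magnitude short of the unaccounted space.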

I have set up an experimental cluster with "bluestore min alloc size = 4096" and so far I've been unable to make it lose space like the first cluster.
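For completeness, a sketch of how that setting would look in ceph.conf (note: min alloc size is baked in when an OSD is created, so it only affects OSDs deployed after the setting is in place):

```ini
# ceph.conf -- must be set before the OSDs are created
[osd]
bluestore min alloc size = 4096
```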


I'm very worried that ceph is unusable because of this issue.



On 19/02/18 19:38, Pavel Shub wrote:
Could you be running into block size (minimum allocation unit)
overhead? The default bluestore min alloc size is 64k for hdd and 16k for ssd.
This is exacerbated if you have tons of small files. I tend to see
this when the "ceph df detail" sum of raw used across pools is less than the
global raw bytes used.

On Mon, Feb 19, 2018 at 2:09 AM, Flemming Frandsen
<flemming.frandsen@xxxxxxxxxxxxxxxx> wrote:
Each OSD lives on a separate HDD in bluestore with the journals on 2GB
partitions on a shared SSD.


On 16/02/18 21:08, Gregory Farnum wrote:

What does the cluster deployment look like? Usually this happens when you’re sharing disks with the OS, or have co-located file journals or something.
On Fri, Feb 16, 2018 at 4:02 AM Flemming Frandsen
<flemming.frandsen@xxxxxxxxxxxxxxxx> wrote:
I'm trying out cephfs and I'm in the process of copying over some
real-world data to see what happens.

I have created a number of cephfs file systems; the only one I've
started working on is the one named jenkins, which lives in
fs_jenkins_data and fs_jenkins_metadata.

According to ceph df I have about 1387 GB of data in all of the pools,
while the raw used space is 5918 GB, which gives a ratio of about 4.3. I would have expected a ratio around 2, as the pool size has been set to 2.
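The ratio in question, from the ceph df figures:

```python
# Observed overhead ratio from the ceph df numbers in this mail.
raw_used_gb = 5918   # GLOBAL "RAW USED"
data_gb = 1387       # sum of USED across the pools
ratio = raw_used_gb / data_gb
print(round(ratio, 1))  # 4.3 -- vs. the ~2.0 expected for size=2 pools
```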


Can anyone explain where half my space has been squandered?

  > ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    8382G     2463G        5918G         70.61
POOLS:
    NAME                         ID     USED       %USED     MAX AVAIL     OBJECTS
    .rgw.root                    1        1113          0         258G            4
    default.rgw.control          2           0          0         258G            8
    default.rgw.meta             3           0          0         258G            0
    default.rgw.log              4           0          0         258G          207
    fs_docker-nexus_data         5      66120M      11.09         258G        22655
    fs_docker-nexus_metadata     6      39463k          0         258G         2376
    fs_meta_data                 7         330          0         258G            4
    fs_meta_metadata             8        567k          0         258G           22
    fs_jenkins_data              9       1321G      71.84         258G     28576278
    fs_jenkins_metadata          10     52178k          0         258G      2285493
    fs_nexus_data                11          0          0         258G            0
    fs_nexus_metadata            12       4181          0         258G           21

--
   Regards Flemming Frandsen - Stibo Systems - DK - STEP Release Manager
   Please use release@xxxxxxxxx for all Release Management requests

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







