Re: Problem with CephFS - No space left on device

> root@pf-us1-dfs3:/home/rodrigo# ceph osd crush rule dump
> [
>    {
>        "rule_id": 0,
>        "rule_name": "replicated_rule",
>        "ruleset": 0,
>        "type": 1,
>        "min_size": 1,
>        "max_size": 10,
>        "steps": [
>            {
>                "op": "take",
>                "item": -1,
>                "item_name": "default"
>            },
>            {
>                "op": "chooseleaf_firstn",
>                "num": 0,
>                "type": "host"
>            }

This means the failure domain is set to "host": the cluster will place each replica of an object on a different host, so that it can lose one host and still keep the data online.
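
A quick way to see the per-host totals (and confirm that pf-us1-dfs3 is the host running out of space) should be:

  ceph osd df tree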

You could change the failure domain to "osd" (per disk), but in that case the cluster will only tolerate the failure of a single disk: you lose the guarantee that all replicas of an object are on different hosts, so you could no longer afford to lose a whole server.
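
If you really wanted to do that (I would not, here), the usual way on Luminous is to create a new replicated rule with "osd" as the failure domain and switch the pools to it, something like this (the rule name is only an example, pool names taken from your "ceph osd pool ls detail" output):

  ceph osd crush rule create-replicated replicated_osd default osd
  ceph osd pool set cephfs_data crush_rule replicated_osd
  ceph osd pool set cephfs_metadata crush_rule replicated_osd

Be aware this will move data around and, again, removes the host-level redundancy.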

The best thing you can do here is to add two disks to pf-us1-dfs3.
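
With ceph-volume that should be something like the following on pf-us1-dfs3 (/dev/sdX and /dev/sdY are placeholders for the new disks, adapt to your deployment tool):

  ceph-volume lvm create --data /dev/sdX
  ceph-volume lvm create --data /dev/sdY

then let the cluster rebalance.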

The second option, if you can't quickly get new disks, would be to move one disk from one of the two other servers to pf-us1-dfs3. I don't know the best way to do that; I have never had this case on my cluster.
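
A rough sketch of one possible approach (untested, adapt the OSD id and device name) would be to drain and remove the OSD on its current host, physically move the disk, then recreate it on pf-us1-dfs3:

  ceph osd out osd.8
  # wait until the cluster has finished rebalancing and is healthy
  systemctl stop ceph-osd@8                 # on the old host
  ceph osd purge osd.8 --yes-i-really-mean-it
  # move the disk to pf-us1-dfs3, then on that host:
  ceph-volume lvm zap /dev/sdX --destroy    # wipe the old LVM/data on the disk
  ceph-volume lvm create --data /dev/sdX

But with the cluster already this full, draining an OSD may not even be possible, so adding new disks is much safer.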

Best regards,

Yoann

> On Tue, Jan 8, 2019 at 11:35 AM Yoann Moulin <yoann.moulin@xxxxxxx> wrote:
> 
>     Hello,
> 
>     > Hi Yoann, thanks for your response.
>     > Here are the results of the commands.
>     >
>     > root@pf-us1-dfs2:/var/log/ceph# ceph osd df
>     > ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS  
>     > 0   hdd 7.27739  1.00000 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310  
>     > 5   hdd 7.27739  1.00000 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271  
>     > 6   hdd 7.27739  1.00000 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49  
>     > 8   hdd 7.27739  1.00000 7.3 TiB 2.5 GiB 7.3 TiB  0.03    0  42  
>     > 1   hdd 7.27739  1.00000 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285  
>     > 3   hdd 7.27739  1.00000 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296  
>     > 7   hdd 7.27739  1.00000 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53  
>     > 9   hdd 7.27739  1.00000 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38  
>     > 2   hdd 7.27739  1.00000 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321  
>     > 4   hdd 7.27739  1.00000 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351  
>     >                    TOTAL  73 TiB  39 TiB  34 TiB 53.13           
>     > MIN/MAX VAR: 0/1.79  STDDEV: 41.15
> 
>     It looks like the data is not well balanced between your OSDs. What is your failure domain?
> 
>     Could you provide your crush map? See http://docs.ceph.com/docs/luminous/rados/operations/crush-map/
> 
>     ceph osd crush tree
>     ceph osd crush rule ls
>     ceph osd crush rule dump
> 
> 
>     > root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
>     > pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 471 flags hashpspool,full stripe_width 0
>     > pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/439 flags hashpspool,full stripe_width 0 application cephfs
>     > pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
>     > pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
>     > pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
>     > pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
>     > pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
> 
>     You may need to increase pg_num for the cephfs_data pool, but before doing that you must understand the impact: https://ceph.com/pgcalc/
>     You can't decrease pg_num, and if it is set too high you may have trouble in your cluster.
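
(For reference, increasing the placement groups on that pool would be something like the two commands below; 512 is only an example value, check pgcalc first and don't do it while the pools are flagged full:

  ceph osd pool set cephfs_data pg_num 512
  ceph osd pool set cephfs_data pgp_num 512
)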
> 
>     > root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
>     > ID CLASS WEIGHT   TYPE NAME            STATUS REWEIGHT PRI-AFF  
>     > -1       72.77390 root default                                  
>     > -3       29.10956     host pf-us1-dfs1                          
>     > 0   hdd  7.27739         osd.0            up  1.00000 1.00000  
>     > 5   hdd  7.27739         osd.5            up  1.00000 1.00000  
>     > 6   hdd  7.27739         osd.6            up  1.00000 1.00000  
>     > 8   hdd  7.27739         osd.8            up  1.00000 1.00000  
>     > -5       29.10956     host pf-us1-dfs2                          
>     > 1   hdd  7.27739         osd.1            up  1.00000 1.00000  
>     > 3   hdd  7.27739         osd.3            up  1.00000 1.00000  
>     > 7   hdd  7.27739         osd.7            up  1.00000 1.00000  
>     > 9   hdd  7.27739         osd.9            up  1.00000 1.00000  
>     > -7       14.55478     host pf-us1-dfs3                          
>     > 2   hdd  7.27739         osd.2            up  1.00000 1.00000  
>     > 4   hdd  7.27739         osd.4            up  1.00000 1.00000
> 
>     You really should add 2 disks to pf-us1-dfs3. Currently the cluster tries to balance data between the 3 hosts (replica 3, failure domain
>     set to 'host' I guess). Each host will store one replica, i.e. 1/3 of the data, and pf-us1-dfs3 only has half the capacity of the two
>     others, so you won't be able to store more than 3x (osd.2 + osd.4) even though there is free space on the other OSDs.
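
(Back-of-the-envelope: with replica 3 and a "host" failure domain, each host holds one full copy, so usable capacity is capped by the smallest host. pf-us1-dfs3 has 2 x 7.27 TiB ≈ 14.5 TiB raw, so roughly 14.5 TiB of data for the whole cluster, minus the nearfull/full safety margins, no matter how much free space the other hosts still have.)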
> 
>     Best regards,
> 
>     Yoann
> 
>     > On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin <yoann.moulin@xxxxxxx> wrote:
>     >
>     >     Hello,
>     >
>     >     > Hi guys, I need your help.
>     >     > I'm new with Cephfs and we started using it as file storage.
>     >     > Today we are getting "no space left on device" but I'm seeing that we have plenty of space on the filesystem.
>     >     > Filesystem              Size  Used Avail Use% Mounted on
>     >     > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   73T   39T   35T  54% /mnt/cephfs
>     >     >
>     >     > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB each but I'm getting the error "No space left on device"
>     >     > every time I want to add a new file.
>     >     > After adding the 2 additional OSD disks I'm seeing that the load is being distributed among the cluster.
>     >     > Please I need your help.
>     >
>     >     Could you give us the output of
>     >
>     >     ceph osd df
>     >     ceph osd pool ls detail
>     >     ceph osd tree
>     >
>     >     Best regards,
>     >
>     >     --
>     >     Yoann Moulin
>     >     EPFL IC-IT
>     >     _______________________________________________
>     >     ceph-users mailing list
>     >     ceph-users@xxxxxxxxxxxxxx
>     >     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>     >
> 
> 
>     -- 
>     Yoann Moulin
>     EPFL IC-IT
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users@xxxxxxxxxxxxxx
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Yoann Moulin
EPFL IC-IT
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



