Given what you've shown here, it's probably one of the odder cases CephFS is subject to, rather than an actual "there's no disk space" error. How far is the script actually getting? Is it possible your client doesn't have permission to write to the RADOS pool and isn't finding that out until too late?
...actually, hrm, is your cluster running a consistent version? I see the MDSes are different point releases, and if you've got some Jewel daemons in the mix, there are a bunch more cases that could be applying under the old behavior.
-Greg
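To answer "how far is the script actually getting?", it may help to copy one affected file with something that reports which step fails. The sketch below is not part of the original exchange. It assumes the error comes from buffered writes being flushed late, which is why `cp` would only see it at close time. The function name `copy_with_stage` and its stage labels are hypothetical.

```python
import errno
import os


def copy_with_stage(src, dst, chunk_size=4 * 1024 * 1024):
    """Copy src to dst; return None on success or (stage, errno_name) on failure.

    stage is "open", "write", "fsync" or "close" (hypothetical labels),
    naming the step that was in progress when the OSError was raised.
    """
    stage = "open"
    try:
        src_fd = os.open(src, os.O_RDONLY)
        try:
            dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            try:
                stage = "write"
                while True:
                    chunk = os.read(src_fd, chunk_size)
                    if not chunk:
                        break
                    # os.write may write fewer bytes than requested, so loop.
                    view = memoryview(chunk)
                    while view:
                        written = os.write(dst_fd, view)
                        view = view[written:]
                # Force buffered data out now so a late ENOSPC or EACCES
                # shows up here rather than only when the file is closed.
                stage = "fsync"
                os.fsync(dst_fd)
                stage = "close"
            finally:
                os.close(dst_fd)
        finally:
            os.close(src_fd)
    except OSError as exc:
        return stage, errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

A result such as `("fsync", "ENOSPC")` or `("close", "ENOSPC")` would suggest the data was accepted into the client cache and rejected only when flushed, rather than the write itself failing.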
On Wed, May 30, 2018 at 5:36 AM Doug Bell <db@xxxxxxxxxxxxxxxxxxx> wrote:
I am new to Ceph and have built a small Ceph instance on 3 servers. I realize the configuration is probably not ideal but I’d like to understand an error I’m getting.
Ceph hosts are cm1, cm2, cm3. CephFS is mounted with ceph-fuse on a server, c1. I am attempting to perform a simple cp -rp from one directory tree already in CephFS to another directory also inside CephFS. The directory tree is 2740 files totaling 93G. Approximately 3/4 of the way through the copy, the following error occurs: "cp: failed to close '<filename>': No space left on device". The odd thing is that the copy seems to finish, as the final directory sizes are the same. But scripts attached to the process see an error, so it is causing a problem.
Any idea what is happening? I have watched all of the ceph logs on one of the ceph servers and haven’t seen anything.
Here is some of the configuration. The names actually aren’t obfuscated, they really are that generic. IP Addresses are altered though.
# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
# ceph status
cluster:
id: c14e77f1-9898-48d8-8a52-cd1f1c5bf689
health: HEALTH_WARN
1 MDSs behind on trimming
services:
mon: 3 daemons, quorum cm1,cm3,cm2
mgr: cm3(active), standbys: cm2, cm1
mds: cephfs-1/1/1 up {0=cm1=up:active}, 1 up:standby-replay, 1 up:standby
osd: 7 osds: 7 up, 7 in
data:
pools: 2 pools, 256 pgs
objects: 377k objects, 401 GB
usage: 1228 GB used, 902 GB / 2131 GB avail
pgs: 256 active+clean
io:
client: 852 B/s rd, 2 op/s rd, 0 op/s wr
# ceph osd status
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | cm1 | 134G | 165G | 0 | 0 | 0 | 0 | exists,up |
| 1 | cm1 | 121G | 178G | 0 | 0 | 0 | 0 | exists,up |
| 2 | cm2 | 201G | 98.3G | 0 | 0 | 1 | 90 | exists,up |
| 3 | cm2 | 207G | 92.1G | 0 | 0 | 0 | 0 | exists,up |
| 4 | cm3 | 217G | 82.8G | 0 | 0 | 0 | 0 | exists,up |
| 5 | cm3 | 192G | 107G | 0 | 0 | 0 | 0 | exists,up |
| 6 | cm1 | 153G | 177G | 0 | 0 | 1 | 16 | exists,up |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 ssd 0.29300 1.00000 299G 134G 165G 44.74 0.78 79
1 ssd 0.29300 1.00000 299G 121G 178G 40.64 0.70 75
6 ssd 0.32370 1.00000 331G 153G 177G 46.36 0.80 102
2 ssd 0.29300 1.00000 299G 201G 100754M 67.20 1.17 129
3 ssd 0.29300 1.00000 299G 207G 94366M 69.28 1.20 127
4 ssd 0.29300 1.00000 299G 217G 84810M 72.39 1.26 131
5 ssd 0.29300 1.00000 299G 192G 107G 64.15 1.11 125
TOTAL 2131G 1228G 902G 57.65
MIN/MAX VAR: 0.70/1.26 STDDEV: 12.36
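As a quick sanity check on the figures above, the per-OSD %USE values can be compared with the fullness thresholds. This is a minimal sketch, not something from the original message. The 0.85 and 0.95 ratios are assumed to be the stock nearfull and full defaults, and this cluster may be configured differently. The function name `classify_osds` is hypothetical.

```python
# Assumed default thresholds; confirm the real values on the cluster.
NEARFULL_RATIO = 0.85
FULL_RATIO = 0.95

# %USE per OSD id, copied from the `ceph osd df` output above.
osd_pct_used = {
    0: 44.74,
    1: 40.64,
    6: 46.36,
    2: 67.20,
    3: 69.28,
    4: 72.39,
    5: 64.15,
}


def classify_osds(pct_used, nearfull=NEARFULL_RATIO, full=FULL_RATIO):
    """Label each OSD "ok", "nearfull" or "full" from its percent used."""
    states = {}
    for osd_id, pct in pct_used.items():
        ratio = pct / 100.0
        if ratio >= full:
            states[osd_id] = "full"
        elif ratio >= nearfull:
            states[osd_id] = "nearfull"
        else:
            states[osd_id] = "ok"
    return states


if __name__ == "__main__":
    for osd_id, state in sorted(classify_osds(osd_pct_used).items()):
        print(osd_id, state)
```

Under those assumed thresholds every OSD is below nearfull (the fullest, osd.4, is at 72.39%), which is consistent with the error not being a plain out-of-space condition.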
# ceph fs get cephfs
Filesystem 'cephfs' (1)
fs_name cephfs
epoch 1047
flags c
created 2018-03-20 13:58:51.860813
modified 2018-03-20 13:58:51.860813
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 0
last_failure_osd_epoch 98
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2}
max_mds 1
in 0
up {0=74127}
failed
damaged
stopped
data_pools [1]
metadata_pool 2
inline_data disabled
balancer
standby_count_wanted 1
74127: 10.1.2.157:6800/3141645279 'cm1' mds.0.36 up:active seq 5 (standby for rank 0)
64318: 10.1.2.194:6803/2623342769 'cm2' mds.0.0 up:standby-replay seq 497658 (standby for rank 0)
# ceph fs status
cephfs - 9 clients
======
+------+----------------+-----+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+----------------+-----+---------------+-------+-------+
| 0 | active | cm1 | Reqs: 0 /s | 295k | 292k |
| 0-s | standby-replay | cm2 | Evts: 0 /s | 0 | 0 |
+------+----------------+-----+---------------+-------+-------+
+-----------------+----------+-------+-------+
| Pool | type | used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 167M | 160G |
| cephfs_data | data | 401G | 160G |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| cm3 |
+-------------+
+----------------------------------------------------------------------------------+---------+
| version | daemons |
+----------------------------------------------------------------------------------+---------+
| ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable) | cm1 |
| ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable) | cm3 |
+----------------------------------------------------------------------------------+---------+
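The table above shows MDS daemons on two different point releases. One way to check the rest of the cluster is to look for daemon types that report more than one version. The sketch below is not from the original thread. It assumes the input is the parsed JSON from `ceph versions`, taken to be a mapping of daemon type to `{version string: count}`. The function name `mixed_versions` is hypothetical, and the sample contains only the two MDS versions shown above.

```python
def mixed_versions(versions):
    """Return {daemon_type: sorted version strings} for types with >1 version.

    `versions` is assumed to look like parsed `ceph versions` output:
    {"mds": {"ceph version 12.2.5 (...) luminous (stable)": 1, ...}, ...}.
    The "overall" summary key, if present, is skipped.
    """
    mixed = {}
    for daemon_type, counts in versions.items():
        if daemon_type == "overall":
            continue
        if len(counts) > 1:
            mixed[daemon_type] = sorted(counts)
    return mixed


# Illustrative input built only from the MDS versions in the table above;
# mon, mgr and osd versions are not shown in the thread and are omitted.
sample = {
    "mds": {
        "ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)": 1,
        "ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)": 1,
    },
}

if __name__ == "__main__":
    for daemon_type, found in mixed_versions(sample).items():
        print(daemon_type, "runs", len(found), "different versions")
```

On a real cluster the input would come from something like `json.loads` applied to the command's output. An empty result would mean each daemon type is on a single version.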
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com