I installed a small Ceph setup in order to test Ceph before deploying a bigger production setup.
The setup consists of 3 OSD nodes (2 OSDs per node), plus 1 MON daemon and 1 MDS daemon. The monitor runs on one of the OSD nodes and the MDS service runs on the admin node.
All nodes run:
Ceph 13.2.6-0.el7.x86_64
CentOS Linux release 7.6.1810 (Core)
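For reference, the layout and versions above can be confirmed with the standard commands (output omitted here to keep the mail short):

ceph versions
ceph osd tree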
While testing the setup I ran into 3 problems:
1. Listing directories is very slow while I am writing multiple files to the same directory.
[cephuser@xxxx ~]$ time ls /mnt/cephfs/dir2/
dir1 fileD
real 0m35.903s
user 0m0.000s
sys 0m0.002s
The MDS log shows:
2019-09-07 15:11:52.019 7fbae9521700 0 log_channel(cluster) log [WRN] : 2 slow requests, 2 included below; oldest blocked for > 33.475100 secs
2019-09-07 15:11:52.019 7fbae9521700 0 log_channel(cluster) log [WRN] : slow request 33.475099 seconds old, received at 2019-09-07 15:11:18.544642: client_request(client.4394:4818 getattr pAsLsXsFs #0x10000000e65 2019-09-07 15:11:18.543367 caller_uid=1000, caller_gid=1000{}) currently failed to rdlock, waiting
2019-09-07 15:11:52.019 7fbae9521700 0 log_channel(cluster) log [WRN] : slow request 30.370497 seconds old, received at 2019-09-07 15:11:21.649244: client_request(client.4394:4819 getattr pAsLsXsFs #0x10000000e65 2019-09-07 15:11:21.648417 caller_uid=1000, caller_gid=1000{}) currently failed to rdlock, waiting
2019-09-07 15:11:53.107 7fbaebfaf700 1 mds.stor1demo Updating MDS map to version 520 from mon.0
2019-09-07 15:11:57.019 7fbae9521700 0 log_channel(cluster) log [WRN] : 2 slow requests, 0 included below; oldest blocked for > 38.475154 secs
2019-09-07 15:12:05.115 7fbaebfaf700 1 mds.stor1demo Updating MDS map to version 521 from mon.0
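If more detail helps, the in-flight MDS requests can be dumped on the node running the MDS while the "ls" hangs (a sketch, assuming the default admin socket location):

# run on the MDS host; mds.stor1demo is the daemon name from the log above
sudo ceph daemon mds.stor1demo dump_ops_in_flight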
However, from the same client node that is doing the writes, the listing is fast:
[cephuser@storage3demo ~]$ time ls /mnt/cephfs/dir2/
dir1 fileD
real 0m0.003s
user 0m0.000s
sys 0m0.003s
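For context, the write workload running at the same time is simply many files being created in the same directory from one client. Something like this reproduces it (an illustrative sketch, not my exact script; file names and sizes are examples):

for i in $(seq 1 100); do
    # write ~1 GiB files into the directory that is being listed
    dd if=/dev/zero of=/mnt/cephfs/dir2/file$i bs=4M count=256 conv=fsync
done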
2. When writing files the bandwidth is around 110 MB/s, but when reading files it is only about 50-60 MB/s. Why this behavior?
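For anyone who wants to reproduce the comparison, a simple sequential test could look like this (an illustrative sketch; file size and path are examples, and the client page cache is dropped before reading so the data is not served locally):

# sequential write (~10 GiB)
dd if=/dev/zero of=/mnt/cephfs/dir2/testfile bs=4M count=2560 conv=fsync
# make sure the read goes to the cluster, not the local page cache
echo 3 | sudo tee /proc/sys/vm/drop_caches
# sequential read
dd if=/mnt/cephfs/dir2/testfile of=/dev/null bs=4M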
3. I ran reweight-by-utilization twice. After some hours Ceph finished the rebalancing, but now it shows "1/2083316 objects misplaced (0.000%)" all the time. How can I fix it?
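What I ran was essentially the following, twice and with the default thresholds (sketch):

ceph osd reweight-by-utilization

The two PGs that stay remapped can be listed with:

ceph pg ls remapped

I can post that output if it is useful.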
Information about my Ceph cluster:
cluster:
id: ad3f4a27-bceb-4635-acd9-691b792661af
health: HEALTH_WARN
1/1980216 objects misplaced (0.000%)
services:
mon: 1 daemons, quorum storage1demo
mgr: storage1demo(active), standbys: storage2demo, storage3demo
mds: cephfs-1/1/1 up {0=stor1demo=up:active}
osd: 6 osds: 6 up, 6 in; 2 remapped pgs
data:
pools: 2 pools, 256 pgs
objects: 990.1 k objects, 3.7 TiB
usage: 7.4 TiB used, 2.1 TiB / 9.6 TiB avail
pgs: 1/1980216 objects misplaced (0.000%)
254 active+clean
2 active+clean+remapped
io:
client: 85 B/s wr, 0 op/s rd, 238 op/s wr
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | storage1demo | 1522G | 340G | 29 | 0 | 0 | 0 | exists,up |
| 1 | storage1demo | 353G | 112G | 12 | 0 | 0 | 0 | exists,up |
| 2 | storage2demo | 1524G | 338G | 47 | 0 | 0 | 0 | exists,up |
| 3 | storage2demo | 1201G | 661G | 38 | 0 | 0 | 0 | exists,up |
| 4 | storage3demo | 1521G | 341G | 45 | 0 | 0 | 0 | exists,up |
| 5 | storage3demo | 1378G | 484G | 35 | 0 | 0 | 0 | exists,up |
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
Thanks in advance.