I installed a small Ceph setup in order to test Ceph before deploying a bigger production setup.
The setup consists of 3 OSD nodes (2 OSDs per node), plus 1 MON daemon and 1 MDS daemon. The monitor runs on one of the OSD nodes and the MDS service runs on the admin node.
All nodes run:
Ceph 13.2.6-0.el7.x86_64
CentOS Linux release 7.6.1810 (Core)
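For reference, the layout and versions above can be confirmed with the standard commands (output omitted here to keep the mail short):

ceph versions
ceph osd tree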
While testing the setup I ran into 3 problems:
1. Listing directories is very slow while I am writing multiple files to the same directory.
[cephuser@xxxx ~]$ time ls /mnt/cephfs/dir2/
dir1 fileD
real 0m35.903s
user 0m0.000s
sys 0m0.002s
The MDS log shows:
2019-09-07 15:11:52.019 7fbae9521700 0 log_channel(cluster) log [WRN] : 2 slow requests, 2 included below; oldest blocked for > 33.475100 secs
2019-09-07 15:11:52.019 7fbae9521700 0 log_channel(cluster) log [WRN] : slow request 33.475099 seconds old, received at 2019-09-07 15:11:18.544642: client_request(client.4394:4818 getattr pAsLsXsFs #0x10000000e65 2019-09-07 15:11:18.543367 caller_uid=1000, caller_gid=1000{}) currently failed to rdlock, waiting
2019-09-07 15:11:52.019 7fbae9521700 0 log_channel(cluster) log [WRN] : slow request 30.370497 seconds old, received at 2019-09-07 15:11:21.649244: client_request(client.4394:4819 getattr pAsLsXsFs #0x10000000e65 2019-09-07 15:11:21.648417 caller_uid=1000, caller_gid=1000{}) currently failed to rdlock, waiting
2019-09-07 15:11:53.107 7fbaebfaf700 1 mds.stor1demo Updating MDS map to version 520 from mon.0
2019-09-07 15:11:57.019 7fbae9521700 0 log_channel(cluster) log [WRN] : 2 slow requests, 0 included below; oldest blocked for > 38.475154 secs
2019-09-07 15:12:05.115 7fbaebfaf700 1 mds.stor1demo Updating MDS map to version 521 from mon.0
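If more detail helps, the in-flight MDS requests can be dumped on the node running the MDS while the "ls" hangs (a sketch, assuming the default admin socket location):

# run on the MDS host; mds.stor1demo is the daemon name from the log above
sudo ceph daemon mds.stor1demo dump_ops_in_flight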
However, from the same client node that is doing the writes, the listing is fast:
[cephuser@storage3demo ~]$ time ls /mnt/cephfs/dir2/
dir1 fileD
real 0m0.003s
user 0m0.000s
sys 0m0.003s
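For context, the write workload running at the same time is simply many files being created in the same directory from one client. Something like this reproduces it (an illustrative sketch, not my exact script; file names and sizes are examples):

for i in $(seq 1 100); do
    # write ~1 GiB files into the directory that is being listed
    dd if=/dev/zero of=/mnt/cephfs/dir2/file$i bs=4M count=256 conv=fsync
done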
2. When writing files the bandwidth is around 110 MB/s, but when reading files it is only about 50-60 MB/s. Why this behavior?
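For anyone who wants to reproduce the comparison, a simple sequential test could look like this (an illustrative sketch; file size and path are examples, and the client page cache is dropped before reading so the data is not served locally):

# sequential write (~10 GiB)
dd if=/dev/zero of=/mnt/cephfs/dir2/testfile bs=4M count=2560 conv=fsync
# make sure the read goes to the cluster, not the local page cache
echo 3 | sudo tee /proc/sys/vm/drop_caches
# sequential read
dd if=/mnt/cephfs/dir2/testfile of=/dev/null bs=4M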
3. I ran reweight-by-utilization twice. After some hours Ceph finished the rebalancing, but now it shows "1/2083316 objects misplaced (0.000%)" all the time. How can I fix it?
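What I ran was essentially the following, twice and with the default thresholds (sketch):

ceph osd reweight-by-utilization

The two PGs that stay remapped can be listed with:

ceph pg ls remapped

I can post that output if it is useful.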
Information about my Ceph cluster:
cluster:
id: ad3f4a27-bceb-4635-acd9-691b792661af
health: HEALTH_WARN
1/1980216 objects misplaced (0.000%)
services:
mon: 1 daemons, quorum storage1demo
mgr: storage1demo(active), standbys: storage2demo, storage3demo
mds: cephfs-1/1/1 up {0=stor1demo=up:active}
osd: 6 osds: 6 up, 6 in; 2 remapped pgs
data:
pools: 2 pools, 256 pgs
objects: 990.1 k objects, 3.7 TiB
usage: 7.4 TiB used, 2.1 TiB / 9.6 TiB avail
pgs: 1/1980216 objects misplaced (0.000%)
254 active+clean
2 active+clean+remapped
io:
client: 85 B/s wr, 0 op/s rd, 238 op/s wr
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | storage1demo | 1522G | 340G | 29 | 0 | 0 | 0 | exists,up |
| 1 | storage1demo | 353G | 112G | 12 | 0 | 0 | 0 | exists,up |
| 2 | storage2demo | 1524G | 338G | 47 | 0 | 0 | 0 | exists,up |
| 3 | storage2demo | 1201G | 661G | 38 | 0 | 0 | 0 | exists,up |
| 4 | storage3demo | 1521G | 341G | 45 | 0 | 0 | 0 | exists,up |
| 5 | storage3demo | 1378G | 484G | 35 | 0 | 0 | 0 | exists,up |
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
Thanks in advance.