Re: long blocking with writes on rbds

Hi Jeff,

Have you tried gathering iostat output on the OSD side to see how your OSD drives behave?

The RBD side shows you what the client is experiencing (the symptom), but it will not help you find the cause of the problem.

Can you grab that iostat output on the OSD VMs (district-1 or district-2, depending on which test you ran last)? When you post it, don't forget to indicate which devices on your VMs are the OSD devices. Something like the invocation sketched below should do.
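
A sketch of what I'd run on each OSD host while the test is in progress (standard sysstat iostat flags; adjust the interval to taste):

    # extended per-device stats, in megabytes, timestamped, every 2 seconds
    iostat -xmt 2

Just note in the paste which of the listed devices are the OSD data disks.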

Have you also investigated the network between your client and the OSDs? While the test is running, do you see any unusual messages in a "ceph -w" output? A quick way to check the link itself is sketched below.
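
Something along these lines from the client (the host name is a placeholder for one of your OSD VMs, and iperf needs "iperf -s" running on the OSD side first):

    # round-trip latency from the client to an OSD host
    ping -c 20 osd-host-1

    # raw TCP throughput to the same host
    iperf -c osd-host-1

Numbers well below what the virtual NICs should deliver would point at the network rather than the OSDs.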

Post it all to a pastebin and we'll see if we can spot something.

As for the "too few PGs" warning: once we've found the root cause of the slowness, you'll be able to adjust and increase the number of PGs per pool, as sketched below.
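
For reference, it's a per-pool setting (the pool name and target count here are placeholders; size the number for your 6 OSDs and replica count, rounding to a power of two):

    ceph osd pool set <pool-name> pg_num 128
    ceph osd pool set <pool-name> pgp_num 128

Remember to raise pgp_num along with pg_num, otherwise the new PGs won't actually be rebalanced across the OSDs.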

Cheers
JC

> On 9 Apr 2015, at 20:25, Jeff Epstein <jeff.epstein@xxxxxxxxxxxxxxxx> wrote:
> 
> As a follow-up to this issue, I'd like to point out some other things I've noticed.
> 
> First, per suggestions posted here, I've reduced the number of pgs per pool. This results in the following ceph status:
> 
>     cluster e96e10d3-ad2b-467f-9fe4-ab5269b70206
>      health HEALTH_WARN too few pgs per osd (14 < min 20)
>      monmap e1: 3 mons at {a=192.168.224.4:6789/0,b=192.168.232.4:6789/0,c=192.168.240.4:6789/0}, election epoch 8, quorum 0,1,2 a,b,c
>      osdmap e238: 6 osds: 6 up, 6 in
>       pgmap v1107: 86 pgs, 23 pools, 2511 MB data, 801 objects
>             38288 MB used, 1467 GB / 1504 GB avail
>                   86 active+clean
> 
> I'm not sure if I should be concerned about the HEALTH_WARN.
> 
> However, this has not helped the performance issues. I've dug deeper to try to understand what is actually happening. It's curious because there isn't much data: our pools are about 5GB, so it really shouldn't take 30 minutes to an hour to run mkfs. Here are some results taken from disk analysis tools while this delay is in progress:
> 
> From pt-diskstats:
> 
>   #ts device    rd_s rd_avkb rd_mb_s rd_mrg rd_cnc   rd_rt    wr_s wr_avkb wr_mb_s wr_mrg wr_cnc   wr_rt busy in_prg    io_s  qtime stime
>   1.0 rbd0       0.0     0.0     0.0     0%    0.0     0.0     0.0     0.0     0.0     0%    0.0     0.0 100%      6     0.0    0.0   0.0
> 
> From iostat:
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> rbd0              0.00     0.03    0.03    0.04     0.13    10.73   310.78     3.31 19730.41    0.40 37704.35 7073.59  49.47
> 
> These results correspond with my experience: the device is busy, as witnessed by the "busy" column in pt-diskstats and the "await" column in iostat. But both tools also attest that there isn't much reading or writing going on; according to pt-diskstats, there isn't any. So my question is: what is ceph doing? It clearly isn't just blocking as a result of excess I/O load; something else is going on. Can anyone please explain?
> 
> Jeff
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




