Re: New cephfs cluster performance issues - Jewel - cache pressure, capability release, poor iostat await/avg queue size

On Wed, Oct 19, 2016 at 1:28 PM, Jim Kilborn <jim@xxxxxxxxxxxx> wrote:
> I have set up a new linux cluster to allow migration from our old SAN based cluster to a new cluster with ceph.
> All systems running centos 7.2 with the 3.10.0-327.36.1 kernel.
> I am basically running stock ceph settings, with just the write cache turned off via hdparm on the drives, and scrubbing temporarily turned off.
>
> The 4 ceph servers are all Dell R730xd with 128GB memory and dual Xeons, so server performance should be good. Since I am running cephfs, I have cache tiering set up.
> Each server has 4 x 4TB drives for the erasure code pool, with k=3 and m=1, so the idea is to survive a single host failure.
> Each server also has a 1TB Samsung 850 Pro SSD for the cache tier, in a replicated pool with size=2.
> Each cache SSD also has a 128GB Samsung SM863 SSD serving as its journal; the SM863 has power loss protection.
> My crush map is set up to ensure the cache pool uses only the four 850 Pros and the erasure code pool uses only the 16 spinning 4TB drives.
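>
> For reference, the pools were created roughly along these lines (from memory -- pg counts, the profile name and the exact target_max_bytes value are approximate, and the crush rules pinning each pool to the ssd/spinning roots are omitted):
>
>   ceph osd erasure-code-profile set ec31 k=3 m=1 ruleset-failure-domain=host ruleset-root=spinning
>   ceph osd pool create cephfs-data 512 512 erasure ec31
>   ceph osd pool create cephfs-cache 256 256 replicated
>   ceph osd pool set cephfs-cache size 2
>   ceph osd tier add cephfs-data cephfs-cache
>   ceph osd tier cache-mode cephfs-cache writeback
>   ceph osd tier set-overlay cephfs-data cephfs-cache
>   ceph osd pool set cephfs-cache target_max_bytes 1400000000000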
>
> The problem I am seeing is that when I start copying data from our old SAN to the ceph volume, once the cache tier reaches my target_max_bytes of 1.4 TB, I start seeing:
>
> HEALTH_WARN 63 requests are blocked > 32 sec; 1 osds have slow requests; noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
> 26 ops are blocked > 65.536 sec on osd.0
> 37 ops are blocked > 32.768 sec on osd.0
> 1 osds have slow requests
> noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>
> osd.0 is the cache SSD
>
> If I watch iostat on the cache SSD, I see that the queue lengths and await times are high.
> Below is the iostat output for the cache drive (osd.0) on the first host. The avgqu-sz is between 87 and 182 and the await is between 88ms and 1193ms.
>
> Device:   rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdb
>                   0.00     0.33    9.00   84.33     0.96    20.11   462.40    75.92  397.56  125.67  426.58  10.70  99.90
>                   0.00     0.67   30.00   87.33     5.96    21.03   471.20    67.86  910.95   87.00 1193.99   8.27  97.07
>                   0.00    16.67   33.00  289.33     4.21    18.80   146.20    29.83   88.99   93.91   88.43   3.10  99.83
>                   0.00     7.33    7.67  261.67     1.92    19.63   163.81   117.42  331.97  182.04  336.36   3.71 100.00
>
>
> If I look at iostat for all the drives, only the cache SSD is backed up:
>
> Device:   rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdg (journal for cache drive)
>                   0.00     6.33    0.00    8.00     0.00     0.07    19.04     0.00    0.33    0.00    0.33   0.33   0.27
> sdb (cache drive)
>                   0.00     0.33    3.33   82.00     0.83    20.07   501.68   106.75 1057.81  269.40 1089.86  11.72 100.00
> sda (4TB EC)
>                   0.00     0.00    0.00    4.00     0.00     0.02     9.33     0.00    0.00    0.00    0.00   0.00   0.00
> sdd (4TB EC)
>                   0.00     0.00    0.00    2.33     0.00     0.45   392.00     0.08   34.00    0.00   34.00   6.86   1.60
> sdf (4TB EC)
>                   0.00    14.00    0.00   26.00     0.00     0.22    17.71     1.00   38.55    0.00   38.55   0.68   1.77
> sdc (4TB EC)
>                   0.00     0.00    0.00    1.33     0.00     0.01     8.75     0.02   12.25    0.00   12.25  12.25   1.63
>
> While at this time it is just complaining about slow requests on osd.0, the other cache tier SSDs sometimes show slow responses as well, but not as frequently.
>
>
> I occasionally see complaints about a client not responding to cache pressure, and yesterday, while copying several terabytes, the client doing the copy was flagged for failing to respond to capability release, and I ended up rebooting it.
>
> It just seems the cluster isn't handling large data copies the way an NFS or SAN based volume would, and I am worried about moving our users to a cluster that is already showing signs of performance issues even when I am just doing a copy with no other users. I am running only one rsync at a time.
>
> Is the problem that I need to use a later kernel for the clients mounting the volume? I have read some posts about that, but the docs say centos 7 with 3.10 is ok.

Which docs say to use the stock CentOS kernel?  A kernel >4.4 is
recommended here: http://docs.ceph.com/docs/master/cephfs/best-practices/
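
If you want to confirm what the clients are actually running, uname -r
on each client is the simplest check; you can also list sessions via
the MDS admin socket on the active MDS host (daemon name below taken
from your fsmap) -- exactly which fields each client reports depends on
its version:

  uname -r                                    # on each client
  ceph daemon mds.darkjedi-ceph04 session ls  # on the active MDS host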

> Do I need more drives in my cache pool? I only have 4 SSD drives in the cache pool (one on each host), each with a separate journal drive.
> But is that too much of a hot spot, since all I/O has to go through the cache layer?
> It seems like my SSDs should be able to keep up with a single rsync copy.
> Is there something set wrong on my SSDs that they can't keep up?

Cache tiering is pretty sensitive to tuning of various parameters, and
does not necessarily work well out of the box.  We do not test CephFS
with cache tiering.  CephFS also has some behaviours that don't mesh
very nicely with cache tiering, like periodically doing small writes
to xattrs that would cause an entire data object to be promoted.
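
If you do keep the cache tier for now, these are the sort of pool
settings involved -- the values below are purely illustrative, not
tested recommendations for your workload:

  ceph osd pool set cephfs-cache hit_set_type bloom
  ceph osd pool set cephfs-cache hit_set_count 4
  ceph osd pool set cephfs-cache hit_set_period 1200
  ceph osd pool set cephfs-cache min_read_recency_for_promote 2
  ceph osd pool set cephfs-cache min_write_recency_for_promote 2
  ceph osd pool set cephfs-cache cache_target_dirty_ratio 0.4
  ceph osd pool set cephfs-cache cache_target_dirty_high_ratio 0.6
  ceph osd pool set cephfs-cache cache_target_full_ratio 0.8

Raising the recency requirements cuts down on promotions, and a lower
dirty ratio starts flushing earlier and more gradually rather than in a
burst when the tier hits target_max_bytes.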

> I put the metadata pool on the ssd cache tier drives as well.

Hmm, seems like a dubious choice: all data operations are going
through those SSDs, *plus* the operations to promote/flush data to the
spinners -- we do not do any prioritisation internally so your
metadata ops are potentially going to be quite impacted by that.
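
One way to see how hard the metadata pool is actually being hit versus
the data/cache traffic is to watch the per-pool client I/O while a copy
is running, alongside the cache flush/evict line you already see in
ceph -w:

  ceph osd pool stats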

> Any ideas where the problem is or what I need to change to make this stable?

Mainly I would suggest that you try changing some stuff and see what
effect it has:
 * Try just running a replicated pool on the spinners and use your SSDs
as journals (or use them for metadata only) -- obviously this does not
get you erasure coding, but it would narrow down where your issues are
coming from (a rough sketch of this is below).
 * Try a more recent kernel (>4.4)
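
For the first point, a minimal sketch of what that could look like --
pool name, pg counts, filesystem name and directory are placeholders,
and the crush rule assumes the existing 'spinning' root and 'disktype'
buckets from your tree:

  ceph osd crush rule create-simple spinning-replicated spinning disktype
  ceph osd pool create cephfs-data-rep 512 512 replicated spinning-replicated
  ceph fs add_data_pool cephfs cephfs-data-rep
  # new files under this directory will land in the new pool
  setfattr -n ceph.dir.layout.pool -v cephfs-data-rep /mnt/cephfs/newdata

Existing data stays where it is, so you could rsync a test directory
into the new pool and compare behaviour before committing to anything.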

If you want to take a longer view of this, it might make sense for you
to use replicated storage (and no cache tier) for now, and by the time
your data has grown the Luminous release could be out with support for
using EC without a cache tier.

John

>
>
> Thanks. Additional details below
>
> The cache tier OSDs are osd.0, osd.5, osd.10, and osd.15
>
> ceph df
> GLOBAL:
>     SIZE       AVAIL      RAW USED     %RAW USED
>     63155G     50960G       12195G         19.31
> POOLS:
>     NAME                ID     USED       %USED     MAX AVAIL     OBJECTS
>     cephfs-metadata     15     58943k         0          523G       28436
>     cephfs-data         16      7382G     15.59        36499G     5013128
>     cephfs-cache        17      1124G      3.56          523G      298348
>
> ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>  0 1.00000  1.00000   931G   669G   261G 71.88 3.72 280
>  5 1.00000  1.00000   931G   526G   404G 56.51 2.93 259
> 10 1.00000  1.00000   931G   589G   341G 63.35 3.28 266
> 15 1.00000  1.00000   931G   468G   462G 50.33 2.61 219
>  1 1.00000  1.00000  3714G   672G  3041G 18.11 0.94 139
>  2 1.00000  1.00000  3714G   593G  3120G 15.98 0.83 122
>  3 1.00000  1.00000  3714G   585G  3128G 15.77 0.82 121
>  4 1.00000  1.00000  3714G   632G  3081G 17.04 0.88 130
>  6 1.00000  1.00000  3714G   589G  3124G 15.87 0.82 120
>  7 1.00000  1.00000  3714G   627G  3087G 16.89 0.87 130
>  8 1.00000  1.00000  3714G   638G  3076G 17.18 0.89 132
>  9 1.00000  1.00000  3714G   630G  3083G 16.98 0.88 130
> 11 1.00000  1.00000  3714G   646G  3067G 17.41 0.90 132
> 12 1.00000  1.00000  3714G   580G  3133G 15.63 0.81 120
> 13 1.00000  1.00000  3714G   624G  3089G 16.82 0.87 129
> 14 1.00000  1.00000  3714G   633G  3081G 17.05 0.88 131
> 16 1.00000  1.00000  3714G   589G  3124G 15.88 0.82 122
> 17 1.00000  1.00000  3714G   624G  3090G 16.80 0.87 129
> 18 1.00000  1.00000  3714G   614G  3099G 16.55 0.86 127
> 19 1.00000  1.00000  3714G   656G  3057G 17.68 0.92 134
> 15 1.00000  1.00000   931G   468G   462G 50.33 2.61 219
> 16 1.00000  1.00000  3714G   589G  3124G 15.88 0.82 122
> 17 1.00000  1.00000  3714G   624G  3090G 16.80 0.87 129
> 18 1.00000  1.00000  3714G   614G  3099G 16.55 0.86 127
> 19 1.00000  1.00000  3714G   656G  3057G 17.68 0.92 134
> 10 1.00000  1.00000   931G   589G   341G 63.35 3.28 266
> 11 1.00000  1.00000  3714G   646G  3067G 17.41 0.90 132
> 12 1.00000  1.00000  3714G   580G  3133G 15.63 0.81 120
> 13 1.00000  1.00000  3714G   624G  3089G 16.82 0.87 129
> 14 1.00000  1.00000  3714G   633G  3081G 17.05 0.88 131
>  5 1.00000  1.00000   931G   526G   404G 56.51 2.93 259
>  6 1.00000  1.00000  3714G   589G  3124G 15.87 0.82 120
>  7 1.00000  1.00000  3714G   627G  3087G 16.89 0.87 130
>  8 1.00000  1.00000  3714G   638G  3076G 17.18 0.89 132
>  9 1.00000  1.00000  3714G   630G  3083G 16.98 0.88 130
>  0 1.00000  1.00000   931G   669G   261G 71.88 3.72 280
>  1 1.00000  1.00000  3714G   672G  3041G 18.11 0.94 139
>  2 1.00000  1.00000  3714G   593G  3120G 15.98 0.83 122
>  3 1.00000  1.00000  3714G   585G  3128G 15.77 0.82 121
>  4 1.00000  1.00000  3714G   632G  3081G 17.04 0.88 130
>               TOTAL 63155G 12195G 50959G 19.31
>
> ceph osd tree
> ID  WEIGHT   TYPE NAME                             UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -14  4.00000 root ssd
>  -1  1.00000     disktype darkjedi-ceph01_ssd
>   0  1.00000         osd.0                              up  1.00000          1.00000
>  -4  1.00000     disktype darkjedi-ceph02_ssd
>   5  1.00000         osd.5                              up  1.00000          1.00000
>  -7  1.00000     disktype darkjedi-ceph03_ssd
>  10  1.00000         osd.10                             up  1.00000          1.00000
> -10  1.00000     disktype darkjedi-ceph04_ssd
>  15  1.00000         osd.15                             up  1.00000          1.00000
> -13 16.00000 root spinning
>  -2  4.00000     disktype darkjedi-ceph01_spinning
>   1  1.00000         osd.1                              up  1.00000          1.00000
>   2  1.00000         osd.2                              up  1.00000          1.00000
>   3  1.00000         osd.3                              up  1.00000          1.00000
>   4  1.00000         osd.4                              up  1.00000          1.00000
>  -5  4.00000     disktype darkjedi-ceph02_spinning
>   6  1.00000         osd.6                              up  1.00000          1.00000
>   7  1.00000         osd.7                              up  1.00000          1.00000
>   8  1.00000         osd.8                              up  1.00000          1.00000
>   9  1.00000         osd.9                              up  1.00000          1.00000
>  -8  4.00000     disktype darkjedi-ceph03_spinning
>  11  1.00000         osd.11                             up  1.00000          1.00000
>  12  1.00000         osd.12                             up  1.00000          1.00000
>  13  1.00000         osd.13                             up  1.00000          1.00000
>  14  1.00000         osd.14                             up  1.00000          1.00000
> -11  4.00000     disktype darkjedi-ceph04_spinning
>  16  1.00000         osd.16                             up  1.00000          1.00000
>  17  1.00000         osd.17                             up  1.00000          1.00000
>  18  1.00000         osd.18                             up  1.00000          1.00000
>  19  1.00000         osd.19                             up  1.00000          1.00000
> -12  5.00000 host darkjedi-ceph04
> -10  1.00000     disktype darkjedi-ceph04_ssd
>  15  1.00000         osd.15                             up  1.00000          1.00000
> -11  4.00000     disktype darkjedi-ceph04_spinning
>  16  1.00000         osd.16                             up  1.00000          1.00000
>  17  1.00000         osd.17                             up  1.00000          1.00000
>  18  1.00000         osd.18                             up  1.00000          1.00000
>  19  1.00000         osd.19                             up  1.00000          1.00000
>  -9  5.00000 host darkjedi-ceph03
>  -7  1.00000     disktype darkjedi-ceph03_ssd
>  10  1.00000         osd.10                             up  1.00000          1.00000
>  -8  4.00000     disktype darkjedi-ceph03_spinning
>  11  1.00000         osd.11                             up  1.00000          1.00000
>  12  1.00000         osd.12                             up  1.00000          1.00000
>  13  1.00000         osd.13                             up  1.00000          1.00000
>  14  1.00000         osd.14                             up  1.00000          1.00000
>  -6  6.00000 host darkjedi-ceph02
>  -4  2.00000     disktype darkjedi-ceph02_ssd
>   5  1.00000         osd.5                              up  1.00000          1.00000
>  -5  4.00000     disktype darkjedi-ceph02_spinning
>   6  1.00000         osd.6                              up  1.00000          1.00000
>   7  1.00000         osd.7                              up  1.00000          1.00000
>   8  1.00000         osd.8                              up  1.00000          1.00000
>   9  1.00000         osd.9                              up  1.00000          1.00000
>  -3  5.00000 host darkjedi-ceph01
>  -1  1.00000     disktype darkjedi-ceph01_ssd
>   0  1.00000         osd.0                              up  1.00000          1.00000
>  -2  4.00000     disktype darkjedi-ceph01_spinning
>   1  1.00000         osd.1                              up  1.00000          1.00000
>   2  1.00000         osd.2                              up  1.00000          1.00000
>   3  1.00000         osd.3                              up  1.00000          1.00000
>   4  1.00000         osd.4                              up  1.00000          1.00000
>
>
> ceph -w
>     cluster 62ed97d6-adf4-12e4-8fd5-3d9701b22b86
>      health HEALTH_WARN
>             72 requests are blocked > 32 sec
>             noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>      monmap e2: 3 mons at {darkjedi-ceph01=192.168.19.241:6789/0,darkjedi-ceph02=192.168.19.242:6789/0,darkjedi-ceph03=192.168.19.243:6789/0}
>             election epoch 120, quorum 0,1,2 darkjedi-ceph01,darkjedi-ceph02,darkjedi-ceph03
>       fsmap e1156: 1/1/1 up {0=darkjedi-ceph04=up:active}, 1 up:standby
>      osdmap e4294: 20 osds: 20 up, 20 in
>             flags noout,noscrub,nodeep-scrub,sortbitwise
>       pgmap v674643: 1024 pgs, 3 pools, 8513 GB data, 5216 kobjects
>             12204 GB used, 50950 GB / 63155 GB avail
>                 1024 active+clean
>   client io 4039 kB/s wr, 0 op/s rd, 4 op/s wr
>   cache io 10080 kB/s flush, 8064 kB/s evict, 0 op/s promote, 1 PG(s) evicting
>
> 2016-10-19 07:27:42.063881 mon.0 [INF] pgmap v674642: 1024 pgs: 1024 active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 10093 kB/s wr, 4 op/s
> 2016-10-19 07:27:38.539687 osd.0 [WRN] 47 slow requests, 1 included below; oldest blocked for > 87.453888 secs
> 2016-10-19 07:27:38.539695 osd.0 [WRN] slow request 30.771983 seconds old, received at 2016-10-19 07:27:07.767636: osd_op(client.1234553.1:412446 17.6f7367c 10000323d95.00000041 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:39.539883 osd.0 [WRN] 48 slow requests, 2 included below; oldest blocked for > 88.454076 secs
> 2016-10-19 07:27:39.539890 osd.0 [WRN] slow request 60.016825 seconds old, received at 2016-10-19 07:26:39.522982: osd_op(client.1234553.1:412252 17.1220dbd9 10000323d93.0000001b [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:39.539897 osd.0 [WRN] slow request 30.063037 seconds old, received at 2016-10-19 07:27:09.476770: osd_op(client.1234553.1:412458 17.dac4885c 10000323d95.0000004d [write 0~2260992 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:40.540136 osd.0 [WRN] 53 slow requests, 5 included below; oldest blocked for > 89.454282 secs
> 2016-10-19 07:27:40.540148 osd.0 [WRN] slow request 30.426319 seconds old, received at 2016-10-19 07:27:10.113694: osd_op(client.1234553.1:412483 17.dd3448b3 10000323d96.00000018 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:40.540153 osd.0 [WRN] slow request 30.557129 seconds old, received at 2016-10-19 07:27:09.982884: osd_op(client.1234553.1:412471 17.6dae7fe6 10000323d96.0000000c [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:40.540160 osd.0 [WRN] slow request 60.947034 seconds old, received at 2016-10-19 07:26:39.592979: osd_op(client.1234553.1:412259 17.5fecfad9 10000323d93.00000022 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:40.540165 osd.0 [WRN] slow request 30.218226 seconds old, received at 2016-10-19 07:27:10.321787: osd_op(client.1234553.1:412487 17.7e783a0b 10000323d96.0000001c [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:40.540173 osd.0 [WRN] slow request 30.669424 seconds old, received at 2016-10-19 07:27:09.870589: osd_op(client.1234553.1:412460 17.242c68e6 10000323d96.00000001 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:41.540415 osd.0 [WRN] 56 slow requests, 4 included below; oldest blocked for > 90.454559 secs
> 2016-10-19 07:27:41.540426 osd.0 [WRN] slow request 30.591025 seconds old, received at 2016-10-19 07:27:10.949264: osd_op(client.1234553.1:412490 17.f00571c7 10000323d96.0000001f [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:41.540431 osd.0 [WRN] slow request 30.584468 seconds old, received at 2016-10-19 07:27:10.955822: osd_op(client.1234553.1:412493 17.e5c93438 10000323d96.00000022 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:41.540438 osd.0 [WRN] slow request 30.570790 seconds old, received at 2016-10-19 07:27:10.969499: osd_op(client.1234553.1:412499 17.75f41c27 10000323d96.00000028 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:41.540445 osd.0 [WRN] slow request 31.588275 seconds old, received at 2016-10-19 07:27:09.952014: osd_op(client.1234553.1:412463 17.e66e96b6 10000323d96.00000004 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:42.540685 osd.0 [WRN] 56 slow requests, 4 included below; oldest blocked for > 91.454840 secs
> 2016-10-19 07:27:42.540694 osd.0 [WRN] slow request 60.595040 seconds old, received at 2016-10-19 07:26:41.945531: osd_op(client.1234553.1:412276 17.cae2b8f3 10000323d93.00000033 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:42.540702 osd.0 [WRN] slow request 30.032041 seconds old, received at 2016-10-19 07:27:12.508530: osd_op(client.1234553.1:412521 17.ec0d7938 10000323d96.0000003e [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:42.540716 osd.0 [WRN] slow request 60.589088 seconds old, received at 2016-10-19 07:26:41.951483: osd_op(client.1234553.1:412277 17.10107d9a 10000323d93.00000034 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently waiting for blocked object
> 2016-10-19 07:27:42.540726 osd.0 [WRN] slow request 60.573651 seconds old, received at 2016-10-19 07:26:41.966920: osd_op(client.1234553.1:412287 17.cac5d907 10000323d93.0000003e [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:43.073014 mon.0 [INF] pgmap v674643: 1024 pgs: 1024 active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 4039 kB/s wr, 4 op/s
> 2016-10-19 07:27:43.147883 mon.0 [INF] HEALTH_WARN; 72 requests are blocked > 32 sec; noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
> 2016-10-19 07:27:43.540977 osd.0 [WRN] 60 slow requests, 5 included below; oldest blocked for > 92.455114 secs
> 2016-10-19 07:27:43.540987 osd.0 [WRN] slow request 30.946896 seconds old, received at 2016-10-19 07:27:12.593949: osd_op(client.1234553.1:412553 17.780b5c32 10000323d97.00000010 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:43.540992 osd.0 [WRN] slow request 30.940430 seconds old, received at 2016-10-19 07:27:12.600414: osd_op(client.1234553.1:412554 17.62f5a288 10000323d97.00000011 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:43.541002 osd.0 [WRN] slow request 30.930237 seconds old, received at 2016-10-19 07:27:12.610608: osd_op(client.1234553.1:412557 17.e6fe4157 10000323d97.00000014 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:43.541009 osd.0 [WRN] slow request 30.669228 seconds old, received at 2016-10-19 07:27:12.871617: osd_op(client.1234553.1:412568 17.dd1ccfe0 10000323d97.0000001f [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:43.541018 osd.0 [WRN] slow request 60.284696 seconds old, received at 2016-10-19 07:26:43.256148: osd_op(client.1234553.1:412302 17.dcfdcaad 10000323d93.0000004d [write 0~2260992 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:44.083654 mon.0 [INF] pgmap v674644: 1024 pgs: 1024 active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 4059 kB/s wr, 2 op/s
> 2016-10-19 07:27:45.105055 mon.0 [INF] pgmap v674645: 1024 pgs: 1024 active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 26360 kB/s wr, 6 op/s
> 2016-10-19 07:27:46.116200 mon.0 [INF] pgmap v674646: 1024 pgs: 1024 active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 90603 kB/s wr, 23 op/s
> 2016-10-19 07:27:47.126401 mon.0 [INF] pgmap v674647: 1024 pgs: 1024 active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 114 MB/s wr, 38 op/s
> 2016-10-19 07:27:48.145239 mon.0 [INF] pgmap v674648: 1024 pgs: 1024 active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 100 MB/s wr, 49 op/s
> 2016-10-19 07:27:49.153554 mon.0 [INF] pgmap v674649: 1024 pgs: 1024 active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 64589 kB/s wr, 30 op/s
> 2016-10-19 07:27:50.158547 mon.0 [INF] pgmap v674650: 1024 pgs: 1024 active+clean; 8514 GB data, 12205 GB used, 50950 GB / 63155 GB avail; 34499 kB/s wr, 8 op/s
> 2016-10-19 07:27:51.175699 mon.0 [INF] pgmap v674651: 1024 pgs: 1024 active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 79100 kB/s wr, 19 op/s
> 2016-10-19 07:27:52.180504 mon.0 [INF] pgmap v674652: 1024 pgs: 1024 active+clean; 8514 GB data, 12205 GB used, 50950 GB / 63155 GB avail; 95241 kB/s wr, 71 op/s
> 2016-10-19 07:27:44.541269 osd.0 [WRN] 59 slow requests, 3 included below; oldest blocked for > 93.455423 secs
> 2016-10-19 07:27:44.541276 osd.0 [WRN] slow request 31.985270 seconds old, received at 2016-10-19 07:27:12.555884: osd_op(client.1234553.1:412536 17.30639c12 10000323d96.0000004d [write 0~2260992 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:44.541282 osd.0 [WRN] slow request 31.980479 seconds old, received at 2016-10-19 07:27:12.560674: osd_op(client.1234553.1:412537 17.5c9321a8 10000323d97.00000000 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:44.541287 osd.0 [WRN] slow request 31.954483 seconds old, received at 2016-10-19 07:27:12.586670: osd_op(client.1234553.1:412545 17.49eb1b07 10000323d97.00000008 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:50.542250 osd.0 [WRN] 53 slow requests, 1 included below; oldest blocked for > 99.456434 secs
> 2016-10-19 07:27:50.542261 osd.0 [WRN] slow request 30.322487 seconds old, received at 2016-10-19 07:27:20.219678: osd_op(client.1234553.1:412574 17.6c703de 10000323d97.00000025 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:51.542498 osd.0 [WRN] 50 slow requests, 3 included below; oldest blocked for > 100.456658 secs
> 2016-10-19 07:27:51.542505 osd.0 [WRN] slow request 60.740689 seconds old, received at 2016-10-19 07:26:50.801700: osd_op(client.1234553.1:412320 17.802362e0 10000323d94.00000011 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:51.542513 osd.0 [WRN] slow request 60.574101 seconds old, received at 2016-10-19 07:26:50.968288: osd_op(client.1234553.1:412321 17.c529e2e6 10000323d94.00000012 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:51.542518 osd.0 [WRN] slow request 30.452691 seconds old, received at 2016-10-19 07:27:21.089698: osd_op(client.1234553.1:412576 17.65178e57 10000323d97.00000027 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:52.542740 osd.0 [WRN] 46 slow requests, 4 included below; oldest blocked for > 101.456908 secs
> 2016-10-19 07:27:52.542746 osd.0 [WRN] slow request 30.674005 seconds old, received at 2016-10-19 07:27:21.868634: osd_op(client.1234553.1:412582 17.722c7ade 10000323d97.0000002d [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:52.542757 osd.0 [WRN] slow request 60.439945 seconds old, received at 2016-10-19 07:26:52.102693: osd_op(client.1234553.1:412333 17.f8df72c7 10000323d94.0000001e [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:52.542763 osd.0 [WRN] slow request 30.875550 seconds old, received at 2016-10-19 07:27:21.667089: osd_op(client.1234553.1:412581 17.13ef9fd2 10000323d97.0000002c [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:52.542770 osd.0 [WRN] slow request 60.317892 seconds old, received at 2016-10-19 07:26:52.224747: osd_op(client.1234553.1:412338 17.7416d4bb 10000323d94.00000023 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently waiting for blocked object
> 2016-10-19 07:27:53.209272 mon.0 [INF] pgmap v674653: 1024 pgs: 1024 active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 73115 kB/s wr, 75 op/s
> 2016-10-19 07:27:53.542922 osd.0 [WRN] 45 slow requests, 1 included below; oldest blocked for > 98.523012 secs
> 2016-10-19 07:27:53.542934 osd.0 [WRN] slow request 30.691735 seconds old, received at 2016-10-19 07:27:22.851117: osd_op(client.1234553.1:412586 17.905a3e43 10000323d97.00000031 [write 0~4194304 [1@-1]] snapc 1=[] ondisk+write e4294) currently reached_pg
> 2016-10-19 07:27:54.219455 mon.0 [INF] pgmap v674654: 1024 pgs: 1024 active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 50465 kB/s wr, 21 op/s
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



