Thanks for the response, Gregory.
We need to support a couple of production services that we have already migrated to Ceph, so we are in a bit of a soup.
The cluster is as follows:
```
ceph osd tree
ID  CLASS WEIGHT   TYPE NAME       STATUS REWEIGHT PRI-AFF
-1        11.06848 root default
-7         5.45799     host master
 5   hdd   5.45799         osd.5       up  1.00000 1.00000
-5         1.81940     host node2
 7   hdd   1.81940         osd.7       up  1.00000 1.00000
-3         1.81940     host node3
 8   hdd   1.81940         osd.8       up  1.00000 1.00000
-9         1.81940     host node4
 6   hdd   1.81940         osd.6       up  1.00000 1.00000
-11        0.15230     host node5
 9   hdd   0.15230         osd.9       up  1.00000 1.00000
```
We have the Ceph cluster and the Kubernetes cluster installed on the same nodes (CentOS 7).
We were seeing low write performance from the Ceph cluster, roughly 10.5 MB/s with ```dd if=/dev/zero of=./here bs=1M count=1024 oflag=direct```.
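For reference, that dd run is the only number we have; we have not yet re-run a raw RADOS benchmark on the current cluster. What we intend to run (the pool name below is just a placeholder for one of our two pools) is roughly:
```
# untested sketch -- <pool> stands in for one of our two pools
rados bench -p <pool> 10 write --no-cleanup   # 10 seconds of 4 MB object writes
rados bench -p <pool> 10 seq                  # sequential reads of the objects just written
rados -p <pool> cleanup                       # remove the benchmark objects afterwards
```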
To address this, we were in the process of adding an additional NIC to each node: rebooting them one by one, confirming that the rebooted node came back healthy, and then moving on to the next.
After every couple of node reboots, the MDS would go down and report data damage. We would follow the disaster-recovery documentation and all would be merry again.
For the last couple of days, however, the MDS has not come back up, and the disaster-recovery procedure no longer works.
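To be clear about what we did each time: the recovery we kept repeating was roughly the standard CephFS disaster-recovery sequence from the docs (reproduced here from memory, so exact flags may differ slightly for our release):
```
# roughly the sequence we followed each time (from memory, per the disaster-recovery docs)
cephfs-journal-tool journal export backup.bin        # back up the MDS journal first
cephfs-journal-tool event recover_dentries summary   # salvage what the journal still holds
cephfs-journal-tool journal reset                    # then discard the damaged journal
cephfs-table-tool all reset session                  # drop stale client sessions
ceph fs reset cephfs --yes-i-really-mean-it          # we run a single active MDS, so reset applies
```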
cluster conf:
```
[global]
fsid = 2ed909ef-e3d7-4081-b01a-d04d12a1155d
mon_initial_members = master, node3, node2
mon_host = 10.10.73.45,10.10.73.44,10.10.73.43
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
public_network = 10.10.73.0/24
osd pool default size = 2 # Write an object 2 times.
osd pool default min size = 2
mon allow pool delete = true
cluster network = 10.10.73.0/24
max open files = 131072
[mon]
mon data = "">
[osd]
osd data = "">
osd journal size = 20000
osd mkfs type = xfs
osd mkfs options xfs = -f
filestore xattr use omap = true
filestore min sync interval = 10
filestore max sync interval = 15
filestore queue max ops = 25000
filestore queue max bytes = 10485760
filestore queue committing max ops = 5000
filestore queue committing max bytes = 10485760000
journal max write bytes = 1073714824
journal max write entries = 10000
journal queue max ops = 50000
journal queue max bytes = 10485760000
osd max write size = 512
osd client message size cap = 2147483648
osd deep scrub stride = 131072
osd op threads = 8
osd disk threads = 4
osd map cache size = 1024
osd map cache bl size = 128
osd mount options xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier"
osd recovery op priority = 4
osd recovery max active = 10
osd max backfills = 4
osd skip data digest = true
[client]
rbd cache = true
rbd cache size = 268435456
rbd cache max dirty = 134217728
rbd cache max dirty age = 5
```
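In case the daemons are not actually running with the values above, we plan to double-check the effective settings through the admin socket (daemon names taken from the outputs in this mail):
```
# check effective settings on running daemons via the admin socket
ceph daemon osd.5 config get osd_max_backfills
ceph daemon mds.master config show | grep -i mds_log
```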
ceph health:
```
master@~/ ceph -s
  cluster:
    id:     2ed909ef-e3d7-4081-b01a-d04d12a1155d
    health: HEALTH_ERR
            4 scrub errors
            Possible data damage: 1 pg inconsistent

  services:
    mon: 3 daemons, quorum node2,node3,master
    mgr: master(active)
    mds: cephfs-1/1/1 up {0=master=up:active(laggy or crashed)}
    osd: 5 osds: 5 up, 5 in

  data:
    pools:   2 pools, 300 pgs
    objects: 194.1 k objects, 33 GiB
    usage:   131 GiB used, 11 TiB / 11 TiB avail
    pgs:     299 active+clean
             1   active+clean+inconsistent
```
ceph health detail
```
ceph health detail
HEALTH_ERR 4 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 4 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
pg 1.43 is active+clean+inconsistent, acting [5,8,7]
```
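We have not yet touched the inconsistent PG; our understanding of the usual next step is below (osd.5 is the acting primary), and we would appreciate confirmation before running the repair:
```
# list the objects/shards in pg 1.43 that failed scrub
rados list-inconsistent-obj 1.43 --format=json-pretty
# ask the primary (osd.5) to repair the PG -- only once it is confirmed safe
ceph pg repair 1.43
```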
The MDS logs have already been provided earlier in the thread. I sincerely appreciate you reading through it all.
Thanks,