First, your EC profile output shows 4+2, not 8+2:
ceph osd erasure-code-profile get my-ec-profile02
crush-device-class=
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
But according to your health output there are 10 chunks per PG, so it
probably is 8+2 after all. Be aware that EC pools usually have a
min_size of k+1, so in your case 9 chunks are required to keep the
PGs active. That means you can only remove one OSD at a time.
You can temporarily reduce min_size to k, but don't forget to set it
back to k+1. It's a safety switch, and it's there for a reason.
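If you really need to do that while recovery is running, it would look
roughly like this, assuming the pool is in fact 8+2 (so k=8):

# temporarily allow I/O with only k chunks available
ceph osd pool set cephfs-backup.data-ec min_size 8
# ... wait for the PGs to become active and recovery to make progress ...
# then restore the k+1 safety margin
ceph osd pool set cephfs-backup.data-ec min_size 9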
Quoting 苏察哈尔灿 <2644294460@xxxxxx>:
I am removing two OSDs on a node, and ceph health detail
shows that there are 15 pgs inactive, as follows:
[WRN] PG_AVAILABILITY: Reduced data availability: 15 pgs inactive
pg 37.4 is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[NONE,40,38,88,0,22,62,32,NONE,21]
pg 37.a is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[38,103,NONE,57,20,NONE,63,51,8,18]
pg 37.c is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[58,7,18,38,52,101,80,NONE,62,NONE]
pg 37.62 is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[67,101,NONE,57,62,NONE,7,26,14,20]
pg 37.97 is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[35,19,NONE,13,58,17,20,6,25,NONE]
pg 37.9d is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[103,22,NONE,63,NONE,34,45,56,90,14]
pg 37.b4 is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[61,NONE,NONE,46,86,60,9,66,21,35]
pg 37.136 is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[10,NONE,24,78,48,89,NONE,56,8,51]
pg 37.13a is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[71,33,NONE,14,32,27,NONE,5,25,57]
pg 37.149 is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[6,69,65,84,27,25,NONE,0,14,NONE]
pg 37.15c is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[103,66,6,23,NONE,NONE,61,45,8,42]
pg 37.16d is stuck inactive for 113m, current state
recovering+undersized+remapped+peered, last acting
[79,68,26,66,58,NONE,52,8,NONE,60]
pg 37.17b is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[80,47,27,14,24,NONE,67,35,NONE,5]
pg 37.1d0 is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[0,76,68,57,NONE,10,48,85,101,NONE]
pg 37.1ea is stuck inactive for 113m, current state
recovery_wait+undersized+degraded+remapped+peered, last acting
[28,24,52,20,10,43,56,NONE,NONE,80]
My pool uses an 8+2 erasure-code configuration:
ceph osd erasure-code-profile get my-ec-profile02
crush-device-class=
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8
In theory, it should still be possible to access the corresponding pool,
but in practice the client cannot access it normally, or it gets stuck
partway through the access. Why?
Also, when using the command "rados -p cephfs-backup.data-ec ls" to
list the pool's objects, it gets stuck as well:
10000000e74.00000cd8
1000000080f.00004436
10000000e3d.000012e3
10000000e50.000018a2
1000000081b.00005b22
100000003d3.0000777a
100000003d7.00003c64
The cursor does not move; the command just hangs.
Thank you!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx