AFAIR, there is a feature request in the works to allow rebuild with K chunks, but not allow normal read/write until min_size is met. Not that I think running with m=1 is a good idea. I'm not seeing the tracker issue for it at the moment, though.

-- Adam

On Tue, Dec 11, 2018 at 9:50 PM Ashley Merrick <singapore@xxxxxxxxxxxxxx> wrote:
>
> Yes, if you set it back to 5, then every time you lose an OSD you'll have to set it to 4 and let the rebuild take place before putting it back to 5.
>
> I guess it's all down to how important 100% uptime is versus manually monitoring the backfill / fixing the OSD / replacing the OSD after dropping to 4, rather than letting it happen automatically and risking a further OSD loss.
>
> If you have the space I'd suggest going to 4 + 2 and migrating your data; this would remove the ongoing issue and give you some extra protection against OSD loss.
>
> On Wed, Dec 12, 2018 at 11:43 AM David Young <funkypenguin@xxxxxxxxxxxxxx> wrote:
>>
>> (accidentally forgot to reply to the list)
>>
>> Thank you, setting min_size to 4 allowed I/O again, and the 39 incomplete PGs are now:
>>
>> 39 active+undersized+degraded+remapped+backfilling
>>
>> Once backfilling is done, I'll increase min_size to 5 again.
>>
>> Am I likely to encounter this issue whenever I lose an OSD (I/O freezes and manually reducing min_size is required), and is there anything I should be doing differently?
>>
>> Thanks again!
>> D
>>
>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> On Wednesday, December 12, 2018 3:31 PM, Ashley Merrick <singapore@xxxxxxxxxxxxxx> wrote:
>>
>> With EC, min_size is set to K + 1.
>>
>> Generally EC is used with an M of 2 or more; because you only have 1 extra M, you are now in a state where a further OSD loss will leave some PGs without at least K chunks available.
>>
>> As per the error, you can get your pool back online by setting min_size to 4.
>>
>> However, this would only be a temporary fix while you get the OSD back online / rebuilt, so you can go back to your 4 + 1 state.
>>
>> ,Ash
>>
>> On Wed, 12 Dec 2018 at 10:27 AM, David Young <funkypenguin@xxxxxxxxxxxxxx> wrote:
>>>
>>> Hi all,
>>>
>>> I have a small 2-node cluster with 40 OSDs, using erasure coding 4+1.
>>>
>>> I lost osd38, and now I have 39 incomplete PGs.
>>>
>>> ---
>>> PG_AVAILABILITY Reduced data availability: 39 pgs inactive, 39 pgs incomplete
>>>     pg 22.2 is incomplete, acting [19,33,10,8,29] (reducing pool media min_size from 5 may help; search ceph.com/docs for 'incomplete')
>>>     pg 22.f is incomplete, acting [17,9,23,14,15] (reducing pool media min_size from 5 may help; search ceph.com/docs for 'incomplete')
>>>     pg 22.12 is incomplete, acting [7,33,10,31,29] (reducing pool media min_size from 5 may help; search ceph.com/docs for 'incomplete')
>>>     pg 22.13 is incomplete, acting [23,0,15,33,13] (reducing pool media min_size from 5 may help; search ceph.com/docs for 'incomplete')
>>>     pg 22.23 is incomplete, acting [29,17,18,15,12] (reducing pool media min_size from 5 may help; search ceph.com/docs for 'incomplete')
>>> <snip>
>>> ---
>>>
>>> My EC profile is below:
>>>
>>> ---
>>> root@prod1:~# ceph osd erasure-code-profile get ec-41-profile
>>> crush-device-class=
>>> crush-failure-domain=osd
>>> crush-root=default
>>> jerasure-per-chunk-alignment=false
>>> k=4
>>> m=1
>>> plugin=jerasure
>>> technique=reed_sol_van
>>> w=8
>>> ---
>>>
>>> When I query one of the incomplete PGs, I see this:
>>>
>>> ---
>>>     "recovery_state": [
>>>         {
>>>             "name": "Started/Primary/Peering/Incomplete",
>>>             "enter_time": "2018-12-11 20:46:11.645796",
>>>             "comment": "not enough complete instances of this PG"
>>>         },
>>> ---
>>>
>>> And this:
>>>
>>> ---
>>>     "probing_osds": [
>>>         "0(4)",
>>>         "7(2)",
>>>         "9(1)",
>>>         "11(4)",
>>>         "22(3)",
>>>         "29(2)",
>>>         "36(0)"
>>>     ],
>>>     "down_osds_we_would_probe": [
>>>         38
>>>     ],
>>>     "peering_blocked_by": []
>>> },
>>> ---
>>>
>>> I have set this in /etc/ceph/ceph.conf to no effect:
>>> osd_find_best_info_ignore_history_les = true
>>>
>>> As a result of the incomplete PGs, I/O is currently frozen to at least part of my cephfs.
>>>
>>> I expected to be able to tolerate the loss of an OSD without issue. Is there anything I can do to restore these incomplete PGs?
>>>
>>> When I bring back a new osd38, I see:
>>> ---
>>>     "probing_osds": [
>>>         "4(2)",
>>>         "11(3)",
>>>         "22(1)",
>>>         "24(1)",
>>>         "26(2)",
>>>         "36(4)",
>>>         "38(1)",
>>>         "39(0)"
>>>     ],
>>>     "down_osds_we_would_probe": [],
>>>     "peering_blocked_by": []
>>> },
>>> {
>>>     "name": "Started",
>>>     "enter_time": "2018-12-11 21:06:35.307379"
>>> }
>>> ---
>>>
>>> But my recovery state is still:
>>>
>>> ---
>>>     "recovery_state": [
>>>         {
>>>             "name": "Started/Primary/Peering/Incomplete",
>>>             "enter_time": "2018-12-11 21:06:35.320292",
>>>             "comment": "not enough complete instances of this PG"
>>>         },
>>> ---
>>>
>>> Any ideas?
>>>
>>> Thanks!
>>> D
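
For anyone hitting the same state, the workaround discussed in the thread boils down to a couple of commands. This is a minimal sketch only, assuming the affected pool is the "media" pool named in the health output and that the failed OSD is the only problem in the cluster:

---
# Temporarily let PGs peer and serve I/O with only K (4) chunks available.
ceph osd pool set media min_size 4

# Watch the incomplete PGs move to active+undersized+degraded+remapped+backfilling.
ceph -s
ceph pg ls incomplete    # should eventually come back empty

# Once backfill finishes and the PGs are active+clean again, restore K+1.
ceph osd pool set media min_size 5
---

While min_size is 4, any further OSD failure can leave PGs with fewer than K chunks (i.e. data loss), which is why Ashley recommends treating it strictly as a temporary measure.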
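
Ashley's longer-term suggestion of moving to 4+2 could look roughly like the sketch below for a CephFS data pool. The profile name (ec-42-profile), pool name (media-ec42), PG counts, filesystem name (cephfs), and directory path (/mnt/cephfs/media) are placeholders to adapt, and copying data into directories that use the new layout is only one possible migration path:

---
# Profile with two coding chunks: one OSD can fail and the pool still
# satisfies min_size = k+1 = 5, so I/O keeps flowing.
ceph osd erasure-code-profile set ec-42-profile \
    k=4 m=2 plugin=jerasure technique=reed_sol_van crush-failure-domain=osd

# Create the replacement pool and allow partial overwrites
# (required when CephFS or RBD writes directly to an EC pool).
ceph osd pool create media-ec42 256 256 erasure ec-42-profile
ceph osd pool set media-ec42 allow_ec_overwrites true

# Attach it to the filesystem as an additional data pool...
ceph fs add_data_pool cephfs media-ec42

# ...and point a directory at it via a file layout.
setfattr -n ceph.dir.layout.pool -v media-ec42 /mnt/cephfs/media
---

File layouts only affect newly created files, so existing data has to be copied (cp/rsync) into directories using the new layout before the old 4+1 pool can be removed from the filesystem.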