Hello,

2018-09-20 09:32:58.851160 mon.dri-ceph01 [WRN] Health check update: 249 PGs pending on creation (PENDING_CREATING_PGS)

This error might indicate that you are hitting the PG-per-OSD limit. Here is some information on it:
https://ceph.com/community/new-luminous-pg-overdose-protection/

You might need to increase mon_max_pg_per_osd for the OSDs to start balancing out.

On Thu, Sep 20, 2018 at 2:25 PM Jaime Ibar <jaime@xxxxxxxxxxxx> wrote:
>
> Hi all,
>
> we recently upgraded from Jewel 10.2.10 to Luminous 12.2.7 and are now trying to migrate the
> OSDs to Bluestore following this document [0]. However, when I mark an OSD as out,
> I get warnings similar to these:
>
> 2018-09-20 09:32:46.079630 mon.dri-ceph01 [WRN] Health check failed: 2 slow requests are blocked > 32 sec. Implicated osds 16,28 (REQUEST_SLOW)
> 2018-09-20 09:32:52.841123 mon.dri-ceph01 [WRN] Health check update: 7 slow requests are blocked > 32 sec. Implicated osds 10,16,28,32,59 (REQUEST_SLOW)
> 2018-09-20 09:32:57.842230 mon.dri-ceph01 [WRN] Health check update: 15 slow requests are blocked > 32 sec. Implicated osds 10,16,28,31,32,59,78,80 (REQUEST_SLOW)
>
> 2018-09-20 09:32:58.851142 mon.dri-ceph01 [WRN] Health check update: 244944/40100780 objects misplaced (0.611%) (OBJECT_MISPLACED)
> 2018-09-20 09:32:58.851160 mon.dri-ceph01 [WRN] Health check update: 249 PGs pending on creation (PENDING_CREATING_PGS)
>
> These warnings prevent Ceph from starting to rebalance, the VMs running on Ceph start hanging, and we have to mark the OSD back in.
>
> I tried reweighting the OSD to 0.90 to minimize the impact on the cluster, but the warnings are the same.
>
> I also tried increasing these settings to
>
> mds cache memory limit = 2147483648
> rocksdb cache size = 2147483648
>
> but with no luck, same warnings.
>
> We also use CephFS for storing files from different projects (no directory fragmentation enabled).
>
> The problem here is that if one OSD dies, all the services will be blocked, as Ceph won't be able to
> start rebalancing.
>
> The cluster is
>
> - 3 mons
> - 3 mds (running on the same hosts as the mons), 2 active and 1 standby
> - 3 mgrs (running on the same hosts as the mons)
> - 6 servers, 12 OSDs each
> - 1GB private network
>
> Does anyone know how to fix this or where the problem could be?
>
> Thanks a lot in advance.
>
> Jaime
>
> [0] http://docs.ceph.com/docs/luminous/rados/operations/bluestore-migration/
>
> --
> Jaime Ibar
> High Performance & Research Computing, IS Services
> Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
> http://www.tchpc.tcd.ie/ | jaime@xxxxxxxxxxxx
> Tel: +353-1-896-3725

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
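
For illustration, a rough sketch of how the suggested mon_max_pg_per_osd bump could be checked and applied on a Luminous cluster. The value 400 is only an example, not a recommendation from the thread, and whether injectargs alone is sufficient or a ceph.conf entry plus a mon restart is also needed should be verified against your own setup:

    # See how many PGs each OSD currently holds (PGS column)
    ceph osd df

    # Raise the per-OSD PG limit on the monitors at runtime (example value only)
    ceph tell mon.* injectargs '--mon_max_pg_per_osd=400'

    # To persist across restarts, add the same setting to ceph.conf under [global]:
    #   mon_max_pg_per_osd = 400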