Ok, so it's a replica-3 pool, and OSDs 68 & 69 are on the same host.

On Friday, 21 September 2018 at 11:09 +0000, Eugen Block wrote:
> > The cache tier on this pool has 26GB of data (for 5.7TB of data on
> > the EC pool).
> > We tried to flush the cache tier and restart OSDs 68 & 69, without
> > any success.
>
> I meant the replication size of the pool:
>
> ceph osd pool ls detail | grep <CACHE_TIER>
>
> In the experimental state of our cluster we had a cache tier (for the
> rbd pool) with size 2, which can cause problems during recovery. Since
> only OSDs 68 and 69 are mentioned, I was wondering if your cache tier
> also has size 2.
>
> Quoting Olivier Bonvalet <ceph.list@xxxxxxxxx>:
>
> > Hi,
> >
> > the cache tier on this pool has 26GB of data (for 5.7TB of data on
> > the EC pool).
> > We tried to flush the cache tier and restart OSDs 68 & 69, without
> > any success.
> >
> > But I don't see any related data on the cache-tier OSDs (filestore)
> > with:
> >
> > find /var/lib/ceph/osd/ -maxdepth 3 -name '*37.9c*'
> >
> > I don't see any useful information in the logs. Maybe I should
> > increase the log level?
> >
> > Thanks,
> >
> > Olivier
> >
> > On Friday, 21 September 2018 at 09:34 +0000, Eugen Block wrote:
> > > Hi Olivier,
> > >
> > > what size does the cache tier have? You could set cache-mode to
> > > forward and flush it; maybe restarting those OSDs (68, 69) helps,
> > > too. Or there could be an issue with the cache tier: what do those
> > > logs say?
> > >
> > > Regards,
> > > Eugen
> > >
> > > Quoting Olivier Bonvalet <ceph.list@xxxxxxxxx>:
> > >
> > > > Hello,
> > > >
> > > > on a Luminous cluster, I have an incomplete PG and I can't find
> > > > how to fix that.
> > > >
> > > > It's an EC pool (4+2):
> > > >
> > > > pg 37.9c is incomplete, acting [32,50,59,1,0,75] (reducing pool
> > > > bkp-sb-raid6 min_size from 4 may help; search ceph.com/docs for
> > > > 'incomplete')
> > > >
> > > > Of course, we can't reduce min_size below 4.
> > > >
> > > > And the full state: https://pastebin.com/zrwu5X0w
> > > >
> > > > So IO is blocked, and we can't access the damaged data.
> > > > OSDs block too:
> > > >
> > > > osds 32,68,69 have stuck requests > 4194.3 sec
> > > >
> > > > OSD 32 is the primary of this PG, and OSDs 68 and 69 are used
> > > > for cache tiering.
> > > >
> > > > Any idea how I can fix that?
> > > >
> > > > Thanks,
> > > >
> > > > Olivier
> > > >
> > > > _______________________________________________
> > > > ceph-users mailing list
> > > > ceph-users@xxxxxxxxxxxxxx
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
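For readers following this thread, the diagnostic and cache-flush steps discussed above can be sketched roughly as below. This is only a sketch, not a fix: the pool name (bkp-sb-raid6), PG id (37.9c) and OSD id (32) come from the thread, while the cache-pool name "bkp-sb-raid6-cache" is an assumption; substitute your own names before running anything.

```shell
# Check the replication size of the pools involved (cache tier included)
ceph osd pool ls detail | grep bkp-sb-raid6

# Query the incomplete PG's full peering state; look in particular at
# "blocked_by" and "down_osds_we_would_probe" in the JSON output
ceph pg 37.9c query

# Put the cache tier into forward mode and flush/evict its objects
# ("bkp-sb-raid6-cache" is a hypothetical cache-pool name)
ceph osd tier cache-mode bkp-sb-raid6-cache forward --yes-i-really-mean-it
rados -p bkp-sb-raid6-cache cache-flush-evict-all

# Temporarily raise debug logging on the primary OSD (32) of pg 37.9c,
# then revert it once the logs have been captured
ceph tell osd.32 injectargs '--debug_osd 10 --debug_ms 1'
ceph tell osd.32 injectargs '--debug_osd 1 --debug_ms 0'
```

Note that flushing a cache tier with stuck requests on its OSDs may itself hang, which would be consistent with the behaviour reported earlier in the thread.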