Re: Reduced data availability: 3 pgs inactive, 3 pgs down

So I just learned that you can tell which pool a PG belongs to based on its name… I didn't realize that before now!
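
For anyone else who didn't already know this: the part of the PG id before the dot is the pool id, so 0.1a, 0.20 and 0.2f all belong to pool 0 ('data'). Something like this shows the mapping (double-check the syntax on your release):

  ceph osd pool ls detail      # pool ids and names
  ceph pg map 0.1a             # up/acting OSDs for one of the down PGs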

Anyway, pool number '0' is a CephFS pool that was used for testing several years ago. This is not production data, so I am no longer as concerned.


pool 0 'data' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 1091615 flags nearfull min_write_recency_for_promote 1 stripe_width 0 application cephfs

At this point it might make more sense to just delete these pools and recreate them if we decide to test cephfs again in the future.
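
If we do go that route, something like the following is what I have in mind (the filesystem name is a placeholder, and I would double-check everything before running it):

  # the old test filesystem has to be down and removed first if it still references the pool
  ceph fs fail <fs_name>
  ceph fs rm <fs_name> --yes-i-really-mean-it
  # pool deletion has to be explicitly enabled; the pool name is given twice as a safety check
  ceph config set mon mon_allow_pool_delete true
  ceph osd pool delete data data --yes-i-really-really-mean-it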

Thanks again for all your help,

Shain


From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
Date: Sunday, October 13, 2024 at 12:59 PM
To: Shain Miley <SMiley@xxxxxxx>
Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re:  Reduced data availability: 3 pgs inactive, 3 pgs down


>
> The majority of the pools have ‘replicated size 3 min_size 2’.
>
Groovy.

> I do see a few pools such as .rgw.control and a few others have ‘replicated size 3 min_size 1’.

Not a good way to run.  Set min_size to 2 after you get healthy.
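
Something along these lines, for each pool that still shows min_size 1:

  ceph osd pool ls detail | grep 'min_size 1'
  ceph osd pool set .rgw.control min_size 2    # repeat for the other affected pools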

> I am not using erasure encoding and none of the pools are set to ‘replicated size 3 min_size 3’.

Odd that you’re in this situation.   You might increase the retries in your crush rules.
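
Rough outline if you want to try that (edit carefully, and keep a copy of the original map):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # in the affected rule, before the first 'step take', add e.g.:
  #   step set_choose_tries 100
  # (or raise 'tunable choose_total_tries' near the top of the map)
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new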

You might also set min_size temporarily to 1 on pool #0, which may let these PGs activate and recover, then immediately set it back to 2, then check whether all PGs now have a full acting set.  NB: there is some risk here.
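
Something like this, and don't leave it at 1 any longer than necessary:

  ceph osd pool set data min_size 1
  ceph pg ls-by-pool data        # watch for the three down PGs to go active and recover
  ceph osd pool set data min_size 2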

>
> Thank you,
>
> Shain
>
>
> From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
> Date: Sunday, October 13, 2024 at 11:29 AM
> To: Shain Miley <SMiley@xxxxxxx>
> Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> Subject: Re:  Reduced data availability: 3 pgs inactive, 3 pgs down
>
> When you get the cluster healthy, redeploy those Filestore OSDs as BlueStore.  Not before.
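>
> Roughly, one OSD at a time once the cluster is healthy (the id and device below are placeholders):
>
>   ceph osd out <id>
>   # wait for the data to migrate off, then:
>   systemctl stop ceph-osd@<id>
>   ceph osd purge <id> --yes-i-really-mean-it
>   ceph-volume lvm zap /dev/<device> --destroy
>   ceph-volume lvm create --bluestore --data /dev/<device>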
>
>
> Does your pool have size=3, min_size=3?  Is this a replicated pool, or EC 2,1?
>
> Don't mark it lost; there are things we can do.  I don't want to suggest anything until you share the above info.
>
>> On Oct 13, 2024, at 10:00 AM, Shain Miley <SMiley@xxxxxxx> wrote:
>>
>> Hello,
>>
>> I am seeing the following information after reviewing ‘ceph health detail’:
>>
>> [WRN] PG_AVAILABILITY: Reduced data availability: 3 pgs inactive, 3 pgs down
>>
>>   pg 0.1a is down, acting [234,35]
>>
>>   pg 0.20 is down, acting [226,267]
>>
>>   pg 0.2f is down, acting [227,161]
>>
>>
>> When I query each of those pgs I see the following message on each of them:
>>
>> "peering_blocked_by": [
>>     {
>>         "osd": 233,
>>         "current_lost_at": 0,
>>         "comment": "starting or marking this osd lost may let us proceed"
>>     }
>> ]
>>
>>
>> osd.233 crashed a while ago, and when I try to start it the log shows some sort of issue with the filesystem:
>>
>>
>> ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)
>> 1: (()+0x12980) [0x7f2779617980]
>> 2: (gsignal()+0xc7) [0x7f27782c9fb7]
>> 3: (abort()+0x141) [0x7f27782cb921]
>> 4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1b2) [0x556ebe773ddf]
>> 5: (FileStore::_do_transaction(ceph::os::Transaction&, unsigned long, int, ThreadPool::TPHandle*, char const*)+0x62b3) [0x556ebebe2753]
>> 6: (FileStore::_do_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, unsigned long, ThreadPool::TPHandle*, char const*)+0x48) [0x556ebebe3f38]
>> 7: (JournalingObjectStore::journal_replay(unsigned long)+0x105a) [0x556ebebfc56a]
>> 8: (FileStore::mount()+0x438a) [0x556ebebda82a]
>> 9: (OSD::init()+0x4d1) [0x556ebe80fdc1]
>> 10: (main()+0x3f8c) [0x556ebe77ad2c]
>> 11: (__libc_start_main()+0xe7) [0x7f27782acbf7]
>> 12: (_start()+0x2a) [0x556ebe78fc4a]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>>
>>
>>
>>
>> At this point I am thinking about either running an xfs repair on osd.233 to see if I can get it back up (once the PGs are healthy again I would likely zap/re-add or replace the drive).
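>>
>> Roughly what I have in mind for the repair attempt (the mount point is the usual default and the device is a placeholder; I would confirm both and start with a dry run):
>>
>>   systemctl stop ceph-osd@233
>>   umount /var/lib/ceph/osd/ceph-233
>>   xfs_repair -n /dev/<device>     # -n = no-modify, report only
>>   xfs_repair /dev/<device>        # only if the dry run looks reasonable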
>>
>>
>>
>> Another option, it sounds like, is to mark the OSD as lost.
>>
>>
>>
>> I am just looking for advice on what exactly I should do next to try to minimize the chances of any data loss.
>>
>> Here is the query output for each of those pgs:
>> https://pastebin.com/YbfnpZGC
>>
>>
>>
>> Thank you,
>>
>> Shain
>>
>>
>>
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



