Re: Full OSD's on cephfs_metadata pool

Hi Robert,

Sorry to hear that this impacted you, but I feel a bit better that I
wasn't alone.  Did you have a lot of log segments to trim on the MDSs
when you recovered?  I would agree that this was a very odd, sudden onset
of space consumption for us.  We usually had around 600GB consumed of the
roughly 8.5TB of available NVMe space until the issue started, and then
we were suddenly at maximum capacity.

I could explain this if, when the MDS is behind on trimming, the
untrimmed log segments accumulate in the metadata pool.  If we got far
enough behind, that alone could have filled up the pool.
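For anyone wanting to check whether their MDS is falling behind, something
along these lines should show it (counter and option names here are from
memory, so treat this as a sketch rather than gospel):

    # the health warning shows up as MDS_TRIM "Behind on trimming (num/max)"
    ceph health detail

    # current vs. trimmed segment counters on the active MDS
    ceph daemon mds.<name> perf dump mds_log

    # the trim threshold the MDS is working against
    ceph daemon mds.<name> config get mds_log_max_segments

    # and watch the metadata pool usage itself
    ceph df detail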

Thanks,
derek

On 3/19/20 7:50 AM, Robert Ruge wrote:
> Thanks Igor. I found that thread in my mailbox a few hours into the episode and it saved the day. I managed to get 6 of the 8 OSDs up, which was enough to get the 10 missing PGs online and transitioned back onto HDD.
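> For anyone following along, moving the metadata pool back onto spinning disk was essentially just a crush rule swap, something along these lines (the rule name is only an example):
> 
>     ceph osd crush rule create-replicated replicated_hdd default host hdd
>     ceph osd pool set cephfs_metadata crush_rule replicated_hdd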
> 
> However, I also appear to have killed two of the OSDs, perhaps through using inappropriate SSDs.
> 
> There was no warning from the cluster that those OSDs were getting full, unless some unusual event caused them to fill overnight.
> 
> I don't have enough NVMe to support this model of operation, so I will need to live with HDDs for a bit longer.
> 
> Regards
> Robert
> 
> ________________________________
> From: Igor Fedotov <ifedotov@xxxxxxx>
> Sent: Thursday, March 19, 2020 10:15:46 PM
> To: Robert Ruge <robert.ruge@xxxxxxxxxxxxx>; ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> Subject: Re:  Full OSD's on cephfs_metadata pool
> 
> Hi Robert,
> 
> there was a thread named "bluefs enospc" a couple of days ago where Derek
> shared steps to bring in a standalone DB volume and get rid of the "enospc"
> error.
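> If it helps, the rough shape of those steps is something like the below
> (device paths are only examples; see Derek's post for the exact procedure):
> 
>     # carve out a new LV on a device with free space
>     lvcreate -L 50G -n osd-93-db ceph-db-vg
> 
>     # attach it to the failed OSD as a standalone DB volume
>     ceph-bluestore-tool bluefs-bdev-new-db \
>         --path /var/lib/ceph/osd/ceph-93 \
>         --dev-target /dev/ceph-db-vg/osd-93-db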
> 
> 
> I'm currently working on a fix which will hopefully allow recovery from this
> failure, but it might take some time before it lands in Nautilus.
> 
> 
> Thanks,
> 
> Igor
> 
> On 3/19/2020 6:10 AM, Robert Ruge wrote:
>> Hi All.
>>
>> Nautilus 14.2.8.
>>
>> I came in this morning to find that six of my eight NVMe OSDs that were housing the cephfs_metadata pool had mysteriously filled up and crashed overnight, and they won't come back up. These OSDs are all single-logical-volume devices with no separate WAL or DB.
>> I have tried extending the LV of one of the OSDs but it can't make use of the extra space, and I have added a separate DB volume, but that didn't help.
>> In the meantime I have told the cluster to move cephfs_metadata back to HDD, which it has kindly done, emptying my two live OSDs, but I am left with 10 PGs inactive.
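>> For reference, extending the LV looked roughly like this (paths are from
>> memory and will differ on other clusters):
>>
>>     # grow the underlying LV, then tell BlueFS about the extra space
>>     lvextend -L +50G /dev/ceph-nvme-vg/osd-block-93
>>     ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-93
>>
>> and the separate DB volume was attached with ceph-bluestore-tool
>> bluefs-bdev-new-db pointed at a fresh LV.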
>>
>> BLUEFS_SPILLOVER BlueFS spillover detected on 6 OSD(s)
>>       osd.93 spilled over 521 MiB metadata from 'db' device (26 GiB used of 50 GiB) to slow device
>>       osd.95 spilled over 456 MiB metadata from 'db' device (26 GiB used of 50 GiB) to slow device
>>       osd.100 spilled over 2.1 GiB metadata from 'db' device (26 GiB used of 50 GiB) to slow device
>>       osd.107 spilled over 782 MiB metadata from 'db' device (26 GiB used of 50 GiB) to slow device
>>       osd.112 spilled over 1.3 GiB metadata from 'db' device (27 GiB used of 50 GiB) to slow device
>>       osd.115 spilled over 1.4 GiB metadata from 'db' device (27 GiB used of 50 GiB) to slow device
>> PG_AVAILABILITY Reduced data availability: 10 pgs inactive, 10 pgs down
>>      pg 2.4e is down, acting [60,6,120]
>>      pg 2.60 is down, acting [105,132,15]
>>      pg 2.61 is down, acting [8,13,112]
>>      pg 2.72 is down, acting [93,112,0]
>>      pg 2.9f is down, acting [117,1,35]
>>      pg 2.b9 is down, acting [95,25,6]
>>      pg 2.c3 is down, acting [97,139,5]
>>      pg 2.c6 is down, acting [95,7,127]
>>      pg 2.d1 is down, acting [36,107,17]
>>      pg 2.f4 is down, acting [23,117,138]
>>
>> Can I back up and recreate an OSD on a larger volume?
>> Can I remove a good PG from an offline OSD to free some space?
>>
>> ceph-bluestore-tool repair fails.
>> "bluefs enospc" seems to be the critical error.
>>
>> So currently my CephFS is unavailable, and any help would be greatly appreciated.
>>
>> Regards
>> Robert Ruge
>>

-- 
Derek T. Yarnell
Director of Computing Facilities
University of Maryland
Institute for Advanced Computer Studies
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


