Re: Urgent help needed please - MDS offline

Patrick Donnelly <pdonnell@xxxxxxxxxx> · Fri, 23 Oct 2020 14:28:24 -0700

On Fri, Oct 23, 2020 at 9:02 AM David C <dcsysengineer@xxxxxxxxx> wrote:
>
> Success!
>
> I remembered I had a server I'd taken out of the cluster to
> investigate some issues, which had some good-quality 800GB Intel DC
> SSDs. I dedicated an entire drive to swap, tuned up min_free_kbytes,
> added an MDS to that server and let it run. It took 3-4 hours, but
> the MDS eventually came back online. It used the 128GB of RAM and
> about 250GB of the swap.
>
> Dan, thanks so much for steering me down this path. I would more
> than likely have started hacking away at the journal otherwise!
>
> Frank, thanks for pointing me towards that other thread. I used your
> min_free_kbytes tip.
>
> I now need to consider updating. I wonder whether the risk-averse
> CephFS operator would go for the latest Nautilus or the latest
> Octopus. It used to be that the newest CephFS code was the most
> stable, but I don't know if that's still the case.

You need to upgrade to Nautilus first in any case. An upgrade can
span at most two releases (n+2) in one step.
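
As a rough illustration of the n+2 rule, here is a minimal Python sketch
that plans the hops between two releases. The release order (Luminous,
Mimic, Nautilus, Octopus) is the real one; the helper name upgrade_path
and the "luminous" starting point in the usage lines are assumptions for
the example only, since the cluster's current release isn't stated in
this message.

```python
# Sketch only: plans upgrade hops under the "n+2 max delta" rule.
# RELEASES reflects the actual Ceph release order; upgrade_path is a
# hypothetical helper, not part of any Ceph tooling.
RELEASES = ["luminous", "mimic", "nautilus", "octopus"]
MAX_DELTA = 2  # an upgrade may span at most two releases


def upgrade_path(current, target, releases=RELEASES, max_delta=MAX_DELTA):
    """Return the releases to install, in order, to go from current to target."""
    start = releases.index(current)
    end = releases.index(target)
    if end < start:
        raise ValueError("downgrades are not covered by this sketch")

    path = []
    pos = start
    while pos < end:
        # Take the largest allowed jump without overshooting the target.
        pos = min(pos + max_delta, end)
        path.append(releases[pos])
    return path


if __name__ == "__main__":
    # Assumed starting release, for illustration only.
    print(upgrade_path("luminous", "octopus"))
    print(upgrade_path("luminous", "nautilus"))
```

The sketch always takes the largest permitted jump, so a target three
releases ahead comes out as two hops with the n+2 release as the
intermediate stop. It says nothing about which release is the more
stable choice; it only captures the ordering constraint.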

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx