Thanks for the replies, folks. This one is resolved. I wish I could tell you
what I changed to fix it, but several undocumented changes went into the
deployment script I'm using whilst I was distracted by something else.
Tearing down and redeploying today no longer shows this particular issue.

I do have a new, less concerning problem, though. I'll start a new thread.

On Tue, 8 Jun 2021 at 12:48, Robert W. Eckert <rob@xxxxxxxxxxxxxxx> wrote:

> When I had issues with the monitors, it was access to the monitor folder
> under /var/lib/ceph/<guid of ceph installation>/mon.<servername>/store.db;
> make sure it is owned by the ceph user.
>
> My issues originated from a hardware issue - the memory needed 1.3 V, but
> the motherboard was only reading 1.2 V (the memory had the issue: the
> firmware said 1.2 V required, the sticker on the side said 1.3 V). So I
> had a script that copied the store across and fixed the permissions.
>
> The other thing that helped a lot, compared to the crash logs, was to edit
> the unit.run file and remove the --rm parameter from the command. That
> lets you see the container output using podman logs <container>, which
> was a bit more detailed.
>
> When you do this, you will need to restore that afterwards, and clean up
> the 'cid' and 'pid' files from /run/ceph-<guid>@mon.<server>.service-cid
> and /run/ceph-<guid>@mon.<server>.service-pid.
>
> My reference is from Red Hat Enterprise Linux 8, so things may be a bit
> different on Ubuntu.
>
> If you get a message about the store.db files being off, it's easiest to
> stop the working node, copy them over, set the user id/group to ceph, and
> start things up.
>
> Rob
>
> -----Original Message-----
> From: Phil Merricks <seffyroff@xxxxxxxxx>
> Sent: Tuesday, June 8, 2021 3:18 PM
> To: ceph-users <ceph-users@xxxxxxx>
> Subject: Mon crash when client mounts CephFS
>
> Hey folks,
>
> I have deployed a 3 node dev cluster using cephadm. Deployment went
> smoothly and all seems well.
>
> If I try to mount a CephFS from a client node, 2/3 mons crash however.
> I've begun picking through the logs to see what I can see, but so far
> other than seeing the crash in the log itself, it's unclear what the cause
> of the crash is.
>
> Here's a log. <https://termbin.com/isaz>. You can see where the crash is
> occurring around the line that begins with "Jun 08 18:56:04 okcomputer
> podman[790987]:"
>
> I would welcome any advice on either what the cause may be, or how I can
> advance the analysis of what's wrong.
>
> Best regards
>
> Phil
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
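
A minimal sketch of the ownership check Rob describes for the monitor's store.db folder, for anyone landing on this thread later. It is not taken from either poster's scripts: the function name not_owned_by is hypothetical, and it only reports mismatches rather than changing anything.

```python
import os
from pathlib import Path


def not_owned_by(root, uid, gid):
    """Return every path under root (root included) whose owner or group
    differs from the expected uid/gid. Read-only: nothing is modified."""
    root = Path(root)
    candidates = [root]
    # Collect the directory itself plus everything beneath it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            candidates.append(Path(dirpath) / name)

    mismatched = []
    for path in candidates:
        st = os.lstat(path)  # lstat: inspect symlinks themselves, not targets
        if st.st_uid != uid or st.st_gid != gid:
            mismatched.append(str(path))
    return mismatched
```

To use it, pass the store.db path from Rob's message together with the numeric uid and gid the monitor is expected to run as; an empty list means ownership is consistent. Looking the ids up with pwd.getpwnam("ceph") is an assumption that a local user of that name exists on the host, which may not hold on every cephadm deployment.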