Re: ceph-fs crashes on getfattr

On Tue, Jul 12, 2022 at 1:46 PM Andras Pataki
<apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> We also had a full MDS crash a couple of weeks ago due to what seems to
> be another back-ported feature going wrong.  As soon as I deployed the
> 16.2.9 ceph-fuse client, someone asked for an unknown ceph xattr, which
> crashed the MDS and then kept crashing the standby MDSes as they got
> activated.
>
> The reason seems to be that the 16.2.9 ceph-fuse (and NOT the 16.2.7
> one) sends a new message to the MDS for an unknown ceph xattr (now
> called vxattr).  The MDS code has a couple of switch statements by
> message type that abort when a message arrives that the MDS does not
> expect, which is exactly what happened.  Then, as the MDS crashed, a
> standby got activated, the clients replayed their pending requests,
> and ... the next MDS crashed on the same message - bringing the whole
> file system down.  Finally I found the culprit client and killed it,
> which got the cluster back to a working state.  I ended up patching
> the 16.2.9 client not to send this message and deployed that change quickly.
>
> My question is - why do we backport significant new features to stable
> releases?  Especially ones that change the messaging API?  I absolutely
> do not expect such a crash when updating a client to a point release.

Generally, a backport like this occurs because it fixes a problem which
is visible to users and is deemed important enough to be worth the
risk. But obviously this one went quite badly wrong — we're going to
do an RCA and figure out what process changes we need to make to
prevent this kind of thing in the future.

(Also, the abort-on-unknown-message behavior is going to get removed.
That's not appropriate behavior at this stage in Ceph's life.
https://tracker.ceph.com/issues/56522)
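
To make the failure mode concrete, here's a minimal standalone sketch
of the pattern in question (illustrative only: the message types and
handler names below are made up, this is not the actual MDS dispatch
code):

    #include <cerrno>
    #include <cstdio>
    #include <cstdlib>

    // Hypothetical message-type constants standing in for Ceph's
    // CEPH_MSG_* values.
    enum MsgType { MSG_GETATTR = 1, MSG_SETATTR = 2, MSG_NEW_VXATTR_OP = 3 };

    // Fragile pattern: any type the switch doesn't know about takes
    // down the whole daemon. This is what bit the MDS here.
    void dispatch_fragile(int type) {
      switch (type) {
      case MSG_GETATTR: /* handle_getattr(); */ break;
      case MSG_SETATTR: /* handle_setattr(); */ break;
      default:
        std::abort();  // one unknown client message -> MDS down
      }
    }

    // Tolerant pattern: reject only the offending request and keep
    // serving everyone else.
    int dispatch_tolerant(int type) {
      switch (type) {
      case MSG_GETATTR: /* handle_getattr(); */ return 0;
      case MSG_SETATTR: /* handle_setattr(); */ return 0;
      default:
        std::fprintf(stderr, "unknown message type %d, rejecting\n", type);
        return -EINVAL;  // or close just that client's session
      }
    }

    int main() {
      // An unknown request type is rejected, not fatal:
      return dispatch_tolerant(MSG_NEW_VXATTR_OP) == -EINVAL ? 0 : 1;
    }

The real dispatch code switches on m->get_type(); the point is only
that the default branch asserts instead of failing the one request.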
-Greg

>
> Andras
>
> On 7/12/22 07:01, Frank Schilder wrote:
> > Hi Gregory,
> >
> > Thanks for your fast reply. I created https://tracker.ceph.com/issues/56529 and attached the standard logs. In case you need more, please let me know. Note that I added some more buggy behaviour to the ticket; the vxattr handling seems broken more or less all the way around.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Gregory Farnum <gfarnum@xxxxxxxxxx>
> > Sent: 11 July 2022 19:14:26
> > To: Frank Schilder
> > Cc: ceph-users@xxxxxxx
> > Subject: Re:  ceph-fs crashes on getfattr
> >
> > On Mon, Jul 11, 2022 at 8:26 AM Frank Schilder <frans@xxxxxx> wrote:
> >> Hi all,
> >>
> >> we made a very weird observation on our ceph test cluster today. A simple getfattr with a misspelled attribute name sends the MDS cluster into a crash+restart loop. Something as simple as
> >>
> >>    getfattr -n ceph.dir.layout.po /mnt/cephfs
> >>
> >> kills the ceph-fs completely. The problem can be resolved by executing "umount -f /mnt/cephfs" on the host where the getfattr was executed. The MDS daemons then need a restart, and one might also need to clear the OSD blacklist.
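> >>
> >> Concretely, the recovery amounted to something like the following
> >> (the daemon id and blacklist entry below are placeholders):
> >>
> >>    # on the client host that issued the getfattr
> >>    umount -f /mnt/cephfs
> >>
> >>    # on the MDS hosts
> >>    systemctl restart ceph-mds@<id>
> >>
> >>    # list and, if needed, remove stale blacklist entries
> >>    ceph osd blacklist ls
> >>    ceph osd blacklist rm <addr:port/nonce>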
> >>
> >> We observe this with a kernel client on 5.18.6-1.el7.elrepo.x86_64 (CentOS 7) with mimic, and I'm absolutely sure I have not seen this problem with mimic on earlier 5.9.x kernel versions.
> >>
> >> Is this known to be a kernel client bug? Possibly fixed already?
> > That obviously shouldn't happen. Please file a tracker ticket.
> >
> > There's been a fair bit of churn in how we handle the "vxattrs", so my
> > guess is an incompatibility got introduced between newer clients and
> > the old server implementation, but obviously we want it to work and we
> > especially shouldn't be crashing the MDS. Skimming through it, I'm
> > actually not seeing what a client *could* do in that path to crash the
> > server, so I'm a bit confused...
> >
> > Oh. I think I see it now, but I'd like to confirm. Yeah, please make
> > that tracker ticket and attach the backtrace you get.
> > Thanks,
> > -Greg
> >
> >> Best regards,
> >> =================
> >> Frank Schilder
> >> AIT Risø Campus
> >> Bygning 109, rum S14
>

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



