On Thu, Mar 6, 2025 at 6:13 AM David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Venky Shankar <vshankar@xxxxxxxxxx> wrote:
>
> > > That's a good point, though there is no code on the client that can
> > > generate this error, I'm not convinced that this error can't be
> > > received from the OSD or the MDS. I would rather some MDS experts
> > > chime in, before taking any drastic measures.
> >
> > The OSDs could possibly return this to the client, so I don't think it
> > can be done away with.
>
> Okay... but then I think ceph has a bug in that you're assuming that the error
> codes on the wire are consistent between arches as mentioned with Alex. I
> think you need to interject a mapping table.

Without looking at the kernel code, Ceph in general wraps all error
codes to a defined arch-neutral endianness for the wire protocol and
unwraps them into the architecture-native format when decoding. Is
that not happening here? It should happen transparently as part of the
network decoding, so when I look in fs/ceph/file.c the usage seems
fine to me, and I see include/linux/ceph/decode.h is full of functions
that specify "le" and translating that to the cpu, so it seems fine.

And yes, the OSD can return EOLDSNAPC if the client is out of date
(and certain other conditions are true).
-Greg
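
To make the two ideas in this exchange concrete, here is a minimal sketch in Python rather than kernel C. It separates byte-order decoding (the "le"-to-cpu step described above) from the errno mapping table suggested in the quoted text. The wire numbers in WIRE_ERRNO_NAMES and the helper names decode_le32 and wire_to_native are hypothetical, chosen for illustration; they are not taken from the Ceph sources.

```python
import errno
import struct

# Hypothetical wire numbering (an assumption for this sketch): the protocol
# fixes one number per symbolic error, independent of any host's errno.h.
WIRE_ERRNO_NAMES = {
    2: "ENOENT",
    35: "EDEADLK",
}


def decode_le32(buf, offset=0):
    """Decode a signed 32-bit little-endian integer from buf.

    The "<" prefix forces little-endian, so the result is the same integer
    on big-endian and little-endian hosts. This only fixes byte order.
    """
    (value,) = struct.unpack_from("<i", buf, offset)
    return value


def wire_to_native(wire_result):
    """Translate a negative wire error code into the host's errno value.

    Non-negative results are returned as-is. Codes missing from the table
    are also passed through unchanged, which is a choice made for this
    sketch only.
    """
    if wire_result >= 0:
        return wire_result
    name = WIRE_ERRNO_NAMES.get(-wire_result)
    if name is None:
        return wire_result
    return -getattr(errno, name)


# Bytes as they might arrive on the wire: -35 as little-endian two's complement.
raw = b"\xdd\xff\xff\xff"
wire_code = decode_le32(raw)
native_code = wire_to_native(wire_code)
```

Byte-order conversion yields the same integer on every host, but it does not by itself ensure that the integer names the same error everywhere. That second step is what a mapping table would cover, if the numbering does differ between architectures.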