It's not about endianness. It's just about the fact that some Linux arches
define the error code behind EOLDSNAPC/ERETRY as a different number.

On Thu, Mar 6, 2025 at 6:22 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Thu, Mar 6, 2025 at 6:13 AM David Howells <dhowells@xxxxxxxxxx> wrote:
> >
> > Venky Shankar <vshankar@xxxxxxxxxx> wrote:
> >
> > > > That's a good point, though there is no code on the client that can
> > > > generate this error, I'm not convinced that this error can't be
> > > > received from the OSD or the MDS. I would rather some MDS experts
> > > > chime in, before taking any drastic measures.
> > >
> > > The OSDs could possibly return this to the client, so I don't think it
> > > can be done away with.
> >
> > Okay... but then I think ceph has a bug in that you're assuming that the error
> > codes on the wire are consistent between arches as mentioned with Alex. I
> > think you need to interject a mapping table.
>
> Without looking at the kernel code, Ceph in general wraps all error
> codes to a defined arch-neutral endianness for the wire protocol and
> unwraps them into the architecture-native format when decoding. Is
> that not happening here? It should happen transparently as part of the
> network decoding, so when I look in fs/ceph/file.c the usage seems
> fine to me, and I see include/linux/ceph/decode.h is full of functions
> that specify "le" and translating that to the cpu, so it seems fine.
> And yes, the OSD can return EOLDSNAPC if the client is out of date
> (and certain other conditions are true).
> -Greg
>
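
To make the mapping-table suggestion from the thread concrete, here is a rough sketch of what such a translation could look like. It is not taken from the Ceph or kernel sources. The names WIRE_EOLDSNAPC, LOCAL_EOLDSNAPC and wire_to_local_errno() are hypothetical, and the wire value 85 (the asm-generic number for ERESTART) is an assumption for illustration only. The point is that byte-order conversion alone leaves the number unchanged, so a code whose value differs per arch needs an explicit lookup.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical fixed wire value. 85 is ERESTART in asm-generic; whether
 * this is what actually travels on the wire is an assumption here. */
#define WIRE_EOLDSNAPC 85

/* Local errno used for the same condition. ERESTART is not the same
 * number on every Linux arch, which is the problem being discussed. */
#ifdef ERESTART
#define LOCAL_EOLDSNAPC ERESTART
#else
#define LOCAL_EOLDSNAPC 85 /* fallback for hosts that lack ERESTART */
#endif

struct errno_map {
	int wire;  /* arch-neutral number as decoded from the message */
	int local; /* number the local architecture uses */
};

static const struct errno_map wire_errno_table[] = {
	{ WIRE_EOLDSNAPC, LOCAL_EOLDSNAPC },
};

/* Translate a positive wire error number into the local errno value.
 * Codes without a table entry are passed through unchanged. */
static int wire_to_local_errno(int wire)
{
	size_t i;

	for (i = 0; i < sizeof(wire_errno_table) / sizeof(wire_errno_table[0]); i++) {
		if (wire_errno_table[i].wire == wire)
			return wire_errno_table[i].local;
	}
	return wire;
}
```

A decoder would call wire_to_local_errno() on the value after the usual le-to-cpu conversion, and the encoder would need the inverse table. Passing unknown codes through unchanged is a design choice made for this sketch, not something confirmed by the thread.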