Hi Ulrich, On Thu, 18 Jul 2019, Ulrich Weigand wrote: > Hello, > > we've been trying to get Ceph to work on IBM Z, a big-endian system, > and have been running into various serious issues relating to endian > conversion code. > > The main issue we've been seeing is that while the old-style > decode/encode machinery in include/encoding.h automatically > byte-swaps all integers on big-endian systems (to ensure the > serialized format is always little endian), the *new-style* > machinery in include/denc.h does not. > > This seemed confusing at first since there is quite a bit of code > there that appears intended to perform exactly that function, e.g. > > template<typename T> > struct ExtType<T, std::enable_if_t<std::is_same_v<T, int32_t> || > std::is_same_v<T, uint32_t>>> { > using type = __le32; > }; > > However, it turns out that at this point __le32 is actually > just an alias for __u32, so this whole machinery doesn't > really do anything at all. > > Looking at the old code in encoding.h, I notice that it works > similarly, but uses ceph_le32 instead of __le32. The former > is a C++ class that actually does perform byte-swap on access. > > Even more confusing, there is this code in include/types.h: > > // temporarily remap __le* to ceph_le* for benefit of shared > kernel/userland headers > #define __le16 ceph_le16 > #define __le32 ceph_le32 > #define __le64 ceph_le64 > #include "ceph_fs.h" > #include "ceph_frag.h" > #include "rbd_types.h" > #undef __le16 > #undef __le32 > #undef __le64 > > which --sometimes-- redefines __le32 as ceph_le32, but those > redefines are not active at the point denc.h is included. > > So it would appear that the usage of __le32 in denc.h is incorrect, > and this code should be using ceph_le32 instead. Is this right? > > But even so, grepping for __le32 throughout the code base shows > quite a bit of additional places where it is used, most of which > also appear to make the assumption that byte-swaps automatically > happen. In addition, there appear to be some places where e.g. > ceph_fs.h is included directly, without going via types.h -- and > in those places, we suddenly no longer get the byte-swaps ... > > > Now I was wondering whether the best way forward might be to just > have __le32 always be defined as ceph_le32 when compiling user > space code. But then I noticed that it used to be that way, > but that was deliberated changed by this commit back in 2010: > > commit 737b5043576153817a6b4195b292672585df10d3 > Author: Sage Weil <sage@xxxxxxxxxxxx> > Date: Fri May 7 13:45:00 2010 -0700 > > endian: simplify __le* type hackery > > Instead of preventing linux/types.h from being included, instead name > our types ceph_le*, and remap using #define _only_ when including the > shared kernel/userspace headers. > > > So I'm a bit at a loss to understand how all this is supposed to > be working. Any suggestions would be welcome -- we'd be willing > to implement whatever's needed, but would like some guidance as > to how the solution should look like ... I think the cleanest thing is to s/__le/ceph_le/ everywhere *except* msgr.h, rados.h, and ceph_fs.h. It seems like what happened was we mostly forgot to always use the ceph_ variant after 2010. That should resolve the issue, right? sage _______________________________________________ Dev mailing list -- dev@xxxxxxx To unsubscribe send an email to dev-leave@xxxxxxx