Sage Weil <sage@xxxxxxxxxxxx> wrote on 18.07.2019 17:00:18:
> On Thu, 18 Jul 2019, Ulrich Weigand wrote:
> > That's what we've tried to do, and it does indeed make work Ceph a lot
> > better on IBM Z (still some issues, but probably unrelated). Once we've
> > gotten a bit farther with testing, we can certainly send patches to
> > that effect.
>
> Great, thanks!
I've now made significant progress. In fact, with my current patch
set I can build Ceph on Z and run the "ctest" suite with zero fails:
100% tests passed, 0 tests failed out of 169
In addition to the endian-related changes, which I'll describe in
detail below, I needed three more patches fixing minor testing
issues, as well as a critical fix for crashes in the Boost "lockfree"
library on Z.
I'll be preparing patches for submission via issues and pull requests,
but I've never contributed to Ceph before, so I may have to first get
more familar with how the project operates ... I'd appreciate any
comments or tips, e.g. on how to (possibly) split up the patch etc.
(See attached file: ceph-endian.diff)
This patch fixes many problems with Ceph on big-endian systems due to
missed endian conversions in various places. The primary change is:
- Remove all instances of __le16/__le32/__le64 and replace them with
ceph_le16/ceph_le32/ceph_le64 instead; the latter will perform
automatic byte swapping on access if required. (Note that in select
places I believe using a byte-swapping type is unnecessary, e.g.
for local variables or parameters; in those cases I've changed
existing __le type uses to plain unsigned integer types instead.)
- The exception is three header files (ceph_fs.h, msgr.h, rados.h),
which are shared with the Linux kernel. These files continue to
use the __le types, but redefine them to ceph_le types when
compiled for user-space. This is now done in those files themselves
instead of in types.h, so that also direct includers are covered.
In addition, I noticed inconsistencies with how __le and/or ceph_le
types were initialized. Given that ceph_le currently has a public
member, it can be initialized as an aggregate, which means the
responsibility for byte swapping is on each user. Some places did
indeed perform byte swapping (either via init_le16/32/64 or directly
via mswab), but not all.
To ensure consistency, I'm suggesting to change ceph_le to make its
member private, which will ensure any value must be set by the
assignment operator (which always byte-swaps). Ideally, we'd also
have a constructor; but that collides with the way the __le types
are used in the shared headers as part of anonymous structs/unions.
GCC does not allow any type with constructor in such places.
For convenience, I've therefore changed the init_le functions to
return ceph_le types, so that you can at least initialize ceph_le
variables to the return values of init_le. All places where a
ceph_le (or formerly __le) variable needs to be initialized are
changed to use this new method; all direct uses of mswab were
eliminated.
Also, in a very small number of places existing code actually
tried to manually byte-swap accesses to __le types using the
leXX_to_cpu routines (but that was a no-op anyway ...). With
the ceph_le types, that now works correctly automatically, so I've
removed that code.
In addition, the patch contains a number of changes to work around
follow-on problems from the decisions made above. Specifically:
- Now that ceph_fs.h always uses C++ types (in user space), it cannot
be included in C files any more. There was exactly one of those,
src/mds/locks.c, and it only used a few CEPH_CAP_... constants from
the header. To fix this, I've simply duplicated those definitions;
as those values appear to be unchangeable ABI constants, I'd not
expect that duplication to cause any problems later on. (Of course,
any alternative suggestions are appreciated!)
- Even in the absence of constructors, the new ceph_le types are now
no longer aggregate types; this causes a few issues when interacting
with compiler extensions, in particular anonymous structs/unions.
GCC no longer allows (without warning) the case where a packed struct
contains *non-packed* structs that contain ceph_le. In practice,
all those structs and substructs were always intended to be completely
packed (and actually are, those substructs where the attribute was
missing happen to be naturally packed), so I've simply added the
missing attributes.
- Similarly, GCC no longer allows a std::array to be an element of a
packed struct that also contains a non-aggregate type (like the new
ceph_le). This happens in one file, msg/async/frames_v2.h. I've
replaced the std::array with a plain C array with the same layout.
- Also, the following code now gives warnings due to overwriting a
private member via memcpy:
copy_from_legacy_head(struct ceph_mds_request_head *head,
struct ceph_mds_request_head_legacy *legacy) {
memcpy(&(head->oldest_client_tid), legacy, sizeof(*legacy));
I'm now simply using a struct assignment instead.
- Several places in os/Transaction.h use operators like += or ++
on (what is now) ceph_le members, but the class doesn't implement
those. I guess it would be possible to add them at the class level,
but since it's just the one file, and in general it seems preferable
to not perform too many operations on byte-swapped types anyway,
I've simply expanded those operations in place.
- Because ceph_le has no constructor, it is impossible to construct
a constexpr variable of any type containing ceph_le. This only
occurs in one test case, where I simply removed the constexpr.
Finally, I've run into three places where endian handling looked simply
wrong, completely independently of the __le vs. ceph_le issue:
- librbd/operation/ResizeRequest.cc:ResizeRequest<I>::send_update_header
has the comment:
// NOTE: format 1 image headers are not stored in fixed endian format
This seems just wrong, both from looking at test failures, and from
looking at code handling those format 1 image headers in the Linux
kernel driver (which uses le_to_cpu for those fields).
- struct PGTempMap in osd/OSDMap.h tracks a number of int32_t pointers
pointing into a buffer list. But that list was generated via encode,
which means int members are bytes-swapped. Fixed by using ceph_le32
pointers instead.
- Similarly, struct HeaderHelper in tools/immutable_object_cache/Types.h
is used to overlay buffer list data, so it should use ceph_le32
instead of uint32_t.
Bye,
Ulrich
Attachment:
ceph-endian.diff
Description: Binary data
_______________________________________________ Dev mailing list -- dev@xxxxxxx To unsubscribe send an email to dev-leave@xxxxxxx