On Thu, Apr 30, 2020 at 3:03 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Wed, 2020-04-29 at 18:08 +0200, Ilya Dryomov wrote:
> > On Wed, Apr 29, 2020 at 5:42 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > On Wed, 2020-04-29 at 11:46 +0200, Eduard Shishkin wrote:
> > > > On 4/28/20 2:23 PM, Jeff Layton wrote:
> > > > > On Mon, 2020-04-27 at 23:46 +0200, edward6@xxxxxxxxxxxxx wrote:
> > > > > > From: Eduard Shishkin <edward6@xxxxxxxxxxxxx>
> > > > > >
> > > > > > In handle_session() the variable @features is always in
> > > > > > little-endian byte order, because the feature bits are packed
> > > > > > bytewise from left to right in encode_supported_features().
> > > > > >
> > > > > > However, test_bit(), which is called to check feature
> > > > > > availability, assumes host byte order in that variable. This
> > > > > > leads to problems on big-endian architectures; specifically,
> > > > > > it is impossible to mount a ceph volume on s390.
> > > > > >
> > > > > > This patch adds a conversion from little-endian to host byte
> > > > > > order, fixing the problem.
> > > > > >
> > > > > > Signed-off-by: Eduard Shishkin <edward6@xxxxxxxxxxxxx>
> > > > > > ---
> > > > > >  fs/ceph/mds_client.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > > > > > index 486f91f..190598d 100644
> > > > > > --- a/fs/ceph/mds_client.c
> > > > > > +++ b/fs/ceph/mds_client.c
> > > > > > @@ -3252,7 +3252,7 @@ static void handle_session(struct ceph_mds_session *session,
> > > > > >          struct ceph_mds_session_head *h;
> > > > > >          u32 op;
> > > > > >          u64 seq;
> > > > > > -        unsigned long features = 0;
> > > > > > +        __le64 features = 0;
> > > > > >          int wake = 0;
> > > > > >          bool blacklisted = false;
> > > > > >
> > > > > > @@ -3301,7 +3301,7 @@ static void handle_session(struct ceph_mds_session *session,
> > > > > >                  if (session->s_state == CEPH_MDS_SESSION_RECONNECTING)
> > > > > >                          pr_info("mds%d reconnect success\n", session->s_mds);
> > > > > >                  session->s_state = CEPH_MDS_SESSION_OPEN;
> > > > > > -                session->s_features = features;
> > > > > > +                session->s_features = le64_to_cpu(features);
> > > > > >                  renewed_caps(mdsc, session, 0);
> > > > > >                  wake = 1;
> > > > > >                  if (mdsc->stopping)
> > > > >
> > > > > (cc'ing Zheng since he did the original patches here)
> > > > >
> > > > > Thanks Eduard. The problem is real, but I think we can just do
> > > > > the conversion during the decode.
> > > > >
> > > > > The feature mask words sent by the MDS are 64 bits, so if it's
> > > > > smaller we can assume that it's malformed. So, I don't think we
> > > > > need to handle the case where it's smaller than 8 bytes.
> > > > >
> > > > > How about this patch instead?
> > > >
> > > > Hi Jeff,
> > > >
> > > > This also works. Please apply.
> > > >
> > > > Thanks,
> > > > Eduard.
> > > >
> > > Thanks. Merged into the ceph-client/testing branch; it should make v5.8.
> >
> > I think this is stable material. I'll tag it and get it queued up
> > for 5.7-rc.
> >
> > Thanks,
> >
> Yeah, that sounds reasonable.
>
> If you're going to send up another PR, then we might want to add these
> bugfixes currently in the testing branch to it as well:
>
> 445645c8be5f fs/ceph:fix special error code in ceph_try_get_caps()
> 591681748b56 fs/ceph:fix double unlock in handle_cap_export()
> 0e84a1ebe161 ceph: ceph_kick_flushing_caps needs the s_mutex
>
> I'm not sure that any of them need to go to stable though.
> We might also want this one though:
>
> 7b3facb61440 ceph: reset i_requested_max_size if file write is not wanted
>
> ...but it'll probably need to be reworked due to merge conflicts if we
> move it ahead of some of the cap handling cleanup patches (or we could
> just pull those in too).
>
> Zheng, do you have an opinion here? Should 7b3facb61440 go to stable?

(dropping IBM folks)

"ceph: ceph_kick_flushing_caps needs the s_mutex" doesn't apply either,
because the cap_dirty list was moved.

I don't see the urgency on these TBH.  The endianness fix and the two
error handling fixups from Wu are now queued up.

Thanks,

                Ilya
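
Jeff's alternative patch is not quoted in this thread. A minimal sketch of
the decode-time conversion he describes (turn the little-endian feature mask
into host byte order while the session message is parsed, so handle_session()
can keep a plain u64 and test bits on it directly) might look like the
following. The helper name decode_session_features(), its parameters, and the
handling of a short mask are illustrative assumptions, not the code that was
actually merged:

#include <linux/types.h>
#include <asm/unaligned.h>

/*
 * Illustrative sketch only, not the merged fix: return the feature
 * mask, which arrives little-endian on the wire, in host byte order
 * so that callers can use ordinary bit tests on it.
 */
static u64 decode_session_features(const void *p, u32 len)
{
        /*
         * Per the discussion above, the MDS sends 64-bit feature
         * words, so a shorter blob is malformed; this sketch simply
         * reports no features in that case.
         */
        if (len < sizeof(u64))
                return 0;

        /* Do the byte-order conversion once, at decode time. */
        return get_unaligned_le64(p);
}

With something along these lines, session->s_features could be assigned the
returned value directly, with no later le64_to_cpu() needed in
handle_session().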