Re: Help with file system with failed mds daemon


 



All sounds right to me... looks like this is a little too bleeding edge for my taste!  I'll probably drop it at this point and just wait until we're actually on a 4.8 kernel before checking on the status again.

Thanks for your help!
-Bryan

-----Original Message-----
From: John Spray [mailto:jspray@xxxxxxxxxx]
Sent: Tuesday, August 22, 2017 2:56 PM
To: Bryan Banister <bbanister@xxxxxxxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re:  Help with file system with failed mds daemon

Note: External Email
-------------------------------------------------

On Tue, Aug 22, 2017 at 8:49 PM, Bryan Banister
<bbanister@xxxxxxxxxxxxxxx> wrote:
> Hi John,
>
> Seems like you're right... strange that it seemed to work with only one mds
> before I shut the cluster down.  Here is the `ceph fs get` output for the
> two file systems:
>
> [root@carf-ceph-osd15 ~]# ceph fs get carf_ceph_kube01
> Filesystem 'carf_ceph_kube01' (2)
> fs_name carf_ceph_kube01
> epoch   22
> flags   8
> created 2017-08-21 12:10:57.948579
> modified        2017-08-21 12:10:57.948579
> tableserver     0
> root    0
> session_timeout 60
> session_autoclose       300
> max_file_size   1099511627776
> last_failure    0
> last_failure_osd_epoch  1218
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
> max_mds 1
> in      0
> up      {}
> failed  0
> damaged
> stopped
> data_pools      [23]
> metadata_pool   24
> inline_data     disabled
> balancer
> standby_count_wanted    0
> [root@carf-ceph-osd15 ~]#
>
> [root@carf-ceph-osd15 ~]# ceph fs get carf_ceph02
> Filesystem 'carf_ceph02' (1)
> fs_name carf_ceph02
> epoch   26
> flags   8
> created 2017-08-18 14:20:50.152054
> modified        2017-08-18 14:20:50.152054
> tableserver     0
> root    0
> session_timeout 60
> session_autoclose       300
> max_file_size   1099511627776
> last_failure    0
> last_failure_osd_epoch  1198
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
> max_mds 1
> in      0
> up      {0=474299}
> failed
> damaged
> stopped
> data_pools      [21]
> metadata_pool   22
> inline_data     disabled
> balancer
> standby_count_wanted    0
> 474299: 7.128.13.69:6800/304042158 'carf-ceph-osd15' mds.0.23 up:active seq 5

In that instance, it's complaining because one of the filesystems
(carf_ceph_kube01, the one with the empty "up" map) has never had an MDS.
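
Each MDS daemon can only hold a rank in one filesystem, so with two
filesystems you want at least two MDS daemons running (plus any standbys).
For what it's worth, here's a rough sketch of how I'd confirm that and bring
up a second daemon -- the host name carf-ceph-osd16 is just a hypothetical
example, and the auth caps are the usual ones from the deployment docs, so
double-check them against your release:

ceph fs status      # shows which filesystem has no active MDS
ceph mds stat       # how many MDS daemons exist and what state they're in

# Create and start an additional MDS on a spare host (hypothetical name):
mkdir -p /var/lib/ceph/mds/ceph-carf-ceph-osd16
ceph auth get-or-create mds.carf-ceph-osd16 \
    mon 'allow profile mds' osd 'allow rwx' mds 'allow' \
    -o /var/lib/ceph/mds/ceph-carf-ceph-osd16/keyring
chown -R ceph:ceph /var/lib/ceph/mds/ceph-carf-ceph-osd16
systemctl enable --now ceph-mds@carf-ceph-osd16

Once the new daemon registers with the monitors it should pick up the rank
that currently has nothing serving it, and the degraded warning should clear.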


> I also looked into trying to specify the mds_namespace option to the mount
> operation (http://docs.ceph.com/docs/master/cephfs/kernel/) but that doesn’t
> seem to be valid:
>
> [ceph-admin@carf-ceph-osd04 ~]$ sudo mount -t ceph carf-ceph-osd15:6789:/
> /mnt/carf_ceph02/ -o
> mds_namespace=carf_ceph02,name=cephfs.k8test,secretfile=k8test.secret
>
> mount error 22 = Invalid argument

It's likely that you are using an older kernel that doesn't have
support for the feature.  It was added in Linux 4.8.
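
You can confirm with `uname -r`.  In the meantime, ceph-fuse can select the
filesystem from userspace; a rough sketch, assuming the keyring for
client.cephfs.k8test is in the usual /etc/ceph location (the paths and IDs
here are just illustrative):

uname -r    # needs to be >= 4.8 for "-o mds_namespace="

sudo ceph-fuse -m carf-ceph-osd15:6789 --id cephfs.k8test \
    --client_mds_namespace=carf_ceph02 /mnt/carf_ceph02

That goes through the FUSE client instead of the kernel one, so it lets you
exercise the second filesystem without waiting for a kernel upgrade.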

John

>
> Thanks,
> -Bryan
>
> -----Original Message-----
> From: John Spray [mailto:jspray@xxxxxxxxxx]
> Sent: Tuesday, August 22, 2017 11:18 AM
> To: Bryan Banister <bbanister@xxxxxxxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Help with file system with failed mds daemon
>
> Note: External Email
> -------------------------------------------------
>
> On Tue, Aug 22, 2017 at 4:58 PM, Bryan Banister
> <bbanister@xxxxxxxxxxxxxxx> wrote:
>> Hi all,
>>
>> I'm still new to Ceph and CephFS.  I'm trying out the multi-fs
>> configuration on a Luminous test cluster.  I shut down the cluster to do
>> an upgrade, and when I brought the cluster back up I now have a warning
>> that one of the file systems has a failed MDS daemon:
>>
>> 2017-08-21 17:00:00.000081 mon.carf-ceph-osd15 [WRN] overall HEALTH_WARN 1
>> filesystem is degraded; 1 filesystem is have a failed mds daemon; 1 pools
>> have many more objects per pg than average; application not enabled on 9
>> pool(s)
>>
>> I tried restarting the MDS service on the system, and it doesn't seem to
>> indicate any problems:
>>
>> 2017-08-21 16:13:40.979449 7fffed8b0700  1 mds.0.20 shutdown: shutting down rank 0
>> 2017-08-21 16:13:41.012167 7ffff7fde1c0  0 set uid:gid to 167:167 (ceph:ceph)
>> 2017-08-21 16:13:41.012180 7ffff7fde1c0  0 ceph version 12.1.4 (a5f84b37668fc8e03165aaf5cbb380c78e4deba4) luminous (rc), process (unknown), pid 16656
>> 2017-08-21 16:13:41.014105 7ffff7fde1c0  0 pidfile_write: ignore empty --pid-file
>> 2017-08-21 16:13:45.541442 7ffff10b7700  1 mds.0.23 handle_mds_map i am now mds.0.23
>> 2017-08-21 16:13:45.541449 7ffff10b7700  1 mds.0.23 handle_mds_map state change up:boot --> up:replay
>> 2017-08-21 16:13:45.541459 7ffff10b7700  1 mds.0.23 replay_start
>> 2017-08-21 16:13:45.541466 7ffff10b7700  1 mds.0.23  recovery set is
>> 2017-08-21 16:13:45.541475 7ffff10b7700  1 mds.0.23  waiting for osdmap 1198 (which blacklists prior instance)
>> 2017-08-21 16:13:45.565779 7fffea8aa700  0 mds.0.cache creating system inode with ino:0x100
>> 2017-08-21 16:13:45.565920 7fffea8aa700  0 mds.0.cache creating system inode with ino:0x1
>> 2017-08-21 16:13:45.571747 7fffe98a8700  1 mds.0.23 replay_done
>> 2017-08-21 16:13:45.571751 7fffe98a8700  1 mds.0.23 making mds journal writeable
>> 2017-08-21 16:13:46.542148 7ffff10b7700  1 mds.0.23 handle_mds_map i am now mds.0.23
>> 2017-08-21 16:13:46.542149 7ffff10b7700  1 mds.0.23 handle_mds_map state change up:replay --> up:reconnect
>> 2017-08-21 16:13:46.542158 7ffff10b7700  1 mds.0.23 reconnect_start
>> 2017-08-21 16:13:46.542161 7ffff10b7700  1 mds.0.23 reopen_log
>> 2017-08-21 16:13:46.542171 7ffff10b7700  1 mds.0.23 reconnect_done
>> 2017-08-21 16:13:47.543612 7ffff10b7700  1 mds.0.23 handle_mds_map i am now mds.0.23
>> 2017-08-21 16:13:47.543616 7ffff10b7700  1 mds.0.23 handle_mds_map state change up:reconnect --> up:rejoin
>> 2017-08-21 16:13:47.543623 7ffff10b7700  1 mds.0.23 rejoin_start
>> 2017-08-21 16:13:47.543638 7ffff10b7700  1 mds.0.23 rejoin_joint_start
>> 2017-08-21 16:13:47.543666 7ffff10b7700  1 mds.0.23 rejoin_done
>> 2017-08-21 16:13:48.544768 7ffff10b7700  1 mds.0.23 handle_mds_map i am now mds.0.23
>> 2017-08-21 16:13:48.544771 7ffff10b7700  1 mds.0.23 handle_mds_map state change up:rejoin --> up:active
>> 2017-08-21 16:13:48.544779 7ffff10b7700  1 mds.0.23 recovery_done -- successful recovery!
>> 2017-08-21 16:13:48.544924 7ffff10b7700  1 mds.0.23 active_start
>> 2017-08-21 16:13:48.544954 7ffff10b7700  1 mds.0.23 cluster recovered.
>>
>> This seems like an easy problem to fix.  Any help is greatly appreciated!
>
> I wonder if you have two filesystems but only one MDS?  Ceph will then
> think that the second filesystem "has a failed MDS" because there
> isn't an MDS online to service it.
>
> John
>
>> -Bryan

________________________________

Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



