Issue with journal on another drive

Hi all,

I am trying to set up a three-node Ceph cluster. Each node runs RHEL 7.1 and has three 1 TB HDDs for OSDs (sdb, sdc, sdd) and one SSD partition (/dev/sda6) for the journal.

I zapped the HDDs and used the following to create OSDs:

# ceph-deploy --overwrite-conf osd create node:/dev/sdb:/dev/sda6
# ceph-deploy --overwrite-conf osd create node:/dev/sdc:/dev/sda6
# ceph-deploy --overwrite-conf osd create node:/dev/sdd:/dev/sda6

I didn't get any errors, but some of the OSDs are not coming up on the nodes:

# ceph osd tree
# id    weight  type name       up/down reweight
-1      8.19    root default
-2      2.73            host osd-01
3       0.91                    osd.3   up      1
0       0.91                    osd.0   up      1
1       0.91                    osd.1   down    0
-3      2.73            host osd-02
4       0.91                    osd.4   up      1
2       0.91                    osd.2   down    0
7       0.91                    osd.7   down    0
-4      2.73            host osd-03
8       0.91                    osd.8   up      1
5       0.91                    osd.5   down    0
6       0.91                    osd.6   up      1
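
For anyone wanting to pull just the failed OSDs out of a listing like the one above, here is a minimal sketch. It assumes the column layout shown in this output (id, weight, name, up/down, reweight); the function name list_down_osds is made up for illustration and is not part of Ceph.

```shell
# Hypothetical helper: read `ceph osd tree` output on stdin and print the
# names of OSDs whose status column is "down".
# Assumes the five-column layout shown above; other Ceph releases may differ.
list_down_osds() {
    awk '$3 ~ /^osd\.[0-9]+$/ && $4 == "down" { print $3 }'
}

# Usage (on a node with an admin keyring):
#   ceph osd tree | list_down_osds
```

Host and root rows are skipped because their third field is not an osd.N name.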

The cluster is not healthy:

# ceph -s
    cluster a1a1fa57-d9eb-4eb1-b0de-7729ce7eb10c
     health HEALTH_WARN 1724 pgs degraded; 96 pgs incomplete; 2 pgs stale; 96 pgs stuck inactive; 2 pgs stuck stale; 2666 pgs stuck unclean; recovery 4/24 objects degraded (16.667%)
     monmap e1: 3 mons at {cntrl-01=10.10.103.21:6789/0,cntrl-02=10.10.103.22:6789/0,cntrl-03=10.10.103.23:6789/0}, election epoch 18, quorum 0,1,2 cntrl-01,cntrl-02,cntrl-03
     osdmap e345: 9 osds: 5 up, 5 in
      pgmap v16755: 4096 pgs, 2 pools, 12976 kB data, 8 objects
            385 MB used, 4654 GB / 4655 GB avail
            4/24 objects degraded (16.667%)
                  46 active
                 627 active+degraded+remapped
                1430 active+clean
                  52 incomplete
                1097 active+degraded
                 798 active+remapped
                   2 stale+active
                  44 remapped+incomplete

I see the following in the logs for the failed OSDs:

2015-07-13 13:58:39.562223 7fafeb12d7c0 0 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7), process ceph-osd, pid 4906
2015-07-13 13:58:39.592437 7fafeb12d7c0 0 filestore(/var/lib/ceph/osd/ceph-7) mount detected xfs (libxfs)
2015-07-13 13:58:39.592447 7fafeb12d7c0 1 filestore(/var/lib/ceph/osd/ceph-7) disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs
2015-07-13 13:58:39.635624 7fafeb12d7c0 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_features: FIEMAP ioctl is supported and appears to work
2015-07-13 13:58:39.635633 7fafeb12d7c0 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-07-13 13:58:39.643786 7fafeb12d7c0 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2015-07-13 13:58:39.643838 7fafeb12d7c0 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_feature: extsize is disabled by conf
2015-07-13 13:58:39.792118 7fafeb12d7c0 0 filestore(/var/lib/ceph/osd/ceph-7) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2015-07-13 13:58:40.064871 7fafeb12d7c0 1 journal _open /var/lib/ceph/osd/ceph-7/journal fd 20: 131080388608 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-07-13 13:58:40.064897 7fafeb12d7c0 -1 journal FileJournal::open: ondisk fsid 60436b03-ece2-4709-a847-cf46ae9d7481 doesn't match expected 1d4e4290-0e91-4f53-a477-bfc09990ef72, invalid (someone else's?) journal
2015-07-13 13:58:40.064928 7fafeb12d7c0 -1 filestore(/var/lib/ceph/osd/ceph-7) mount failed to open journal /var/lib/ceph/osd/ceph-7/journal: (22) Invalid argument
2015-07-13 13:58:40.073118 7fafeb12d7c0 -1  ** ERROR: error converting store /var/lib/ceph/osd/ceph-7: (22) Invalid argument
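
The key line in the log is the FileJournal::open fsid mismatch. A rough sketch for pulling the two fsids out of an OSD log so they can be compared across the failed OSDs is below. It only does text matching on the message format shown above; the function name journal_fsid_mismatches is hypothetical, and the log path in the usage comment assumes the default log location.

```shell
# Hypothetical helper: read an OSD log on stdin and, for every journal fsid
# mismatch line, print "<ondisk fsid> <expected fsid>".
# Relies only on the message wording seen in the log above.
journal_fsid_mismatches() {
    sed -n 's/.*ondisk fsid \([0-9a-f-]*\) doesn.t match expected \([0-9a-f-]*\),.*/\1 \2/p'
}

# Usage (path is an assumption based on the default log location):
#   journal_fsid_mismatches < /var/log/ceph/ceph-osd.7.log
```

If several down OSDs on one node report the same on-disk fsid, that would suggest they are all reading the same journal on /dev/sda6.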

Is there something that needs to be done to the journal partition to allow it to be shared between multiple OSDs? Or is something else causing the issue?

Thanks.

--
Rimma



