On 10/15/2017 03:13 AM, Denes Dolhay wrote:
Hello,
Could you include the monitors and the osds as well in your clock skew test?
How did you create the osds? ceph-deploy osd create osd1:/dev/sdX osd2:/dev/sdY osd3:/dev/sdZ ?
Some log from one of the osds would be great!
Kind regards,
Denes.
On 10/14/2017 07:39 PM, dE wrote:
On 10/14/2017 08:18 PM, David Turner wrote:
What are the ownership permissions on your osd folders? Clock skew cares about partial seconds.
It isn't a networking issue because your cluster isn't stuck peering. I'm not sure if the creating state happens on disk or in the cluster.
I attached 1TB disks to each osd.
    cluster 8161c90e-dbd2-4491-acf8-74449bef916a
     health HEALTH_ERR
            clock skew detected on mon.1, mon.2
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            too few PGs per OSD (21 < min 30)
            Monitor clock skew detected
     monmap e1: 3 mons at {0=10.247.103.139:8567/0,1=10.247.103.140:8567/0,2=10.247.103.141:8567/0}
            election epoch 12, quorum 0,1,2 0,1,2
     osdmap e10: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v38: 64 pgs, 1 pools, 0 bytes data, 0 objects
            33963 MB used, 3037 GB / 3070 GB avail
                  64 creating
I don't seem to have any clock skew --

for i in {139..141}; do ssh $i date +%s; done
1507989554
1507989554
1507989554
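Since the skew check cares about partial seconds, the same loop can be repeated with fractional-second resolution to rule out sub-second drift (same hosts as above, just a rough sketch):

for i in {139..141}; do ssh $i date +%s.%N; done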
The osd folders are owned ceph:root. I tried ceph:ceph, and also ran ceph-osd as root.
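(Checking and switching the ownership was roughly this, using the osd data dir /srv/ceph/osd:

ls -ld /srv/ceph/osd
chown -R ceph:ceph /srv/ceph/osd
)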
The monitors and OSDs are on the same hosts.
The output of one of the OSDs (run directly in a terminal):
ceph-osd -i 0 -f -d --setuser ceph --setgroup ceph
starting osd.0 at :/0 osd_data /srv/ceph/osd /srv/ceph/osd/osd_journal
2017-10-15 09:03:20.234260 7f49bdb00900 0 set uid:gid to 64045:64045 (ceph:ceph)
2017-10-15 09:03:20.234269 7f49bdb00900 0 ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367), process ceph-osd, pid 1068
2017-10-15 09:03:20.234636 7f49bdb00900 0 pidfile_write: ignore empty --pid-file
2017-10-15 09:03:20.247340 7f49bdb00900 0 filestore(/srv/ceph/osd) backend xfs (magic 0x58465342)
2017-10-15 09:03:20.247940 7f49bdb00900 0 genericfilestorebackend(/srv/ceph/osd) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-10-15 09:03:20.247959 7f49bdb00900 0 genericfilestorebackend(/srv/ceph/osd) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-10-15 09:03:20.247982 7f49bdb00900 0 genericfilestorebackend(/srv/ceph/osd) detect_features: splice is supported
2017-10-15 09:03:20.248777 7f49bdb00900 0 genericfilestorebackend(/srv/ceph/osd) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-10-15 09:03:20.248820 7f49bdb00900 0 xfsfilestorebackend(/srv/ceph/osd) detect_feature: extsize is disabled by conf
2017-10-15 09:03:20.249386 7f49bdb00900 1 leveldb: Recovering log #5
2017-10-15 09:03:20.249420 7f49bdb00900 1 leveldb: Level-0 table #7: started
2017-10-15 09:03:20.250334 7f49bdb00900 1 leveldb: Level-0 table #7: 146 bytes OK
2017-10-15 09:03:20.252409 7f49bdb00900 1 leveldb: Delete type=0 #5
2017-10-15 09:03:20.252449 7f49bdb00900 1 leveldb: Delete type=3 #4
2017-10-15 09:03:20.252552 7f49bdb00900 0 filestore(/srv/ceph/osd) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-10-15 09:03:20.252708 7f49bdb00900 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2017-10-15 09:03:20.252714 7f49bdb00900 1 journal _open /srv/ceph/osd/osd_journal fd 17: 10737418240 bytes, block size 4096 bytes, directio = 1, aio = 0
2017-10-15 09:03:20.253053 7f49bdb00900 1 journal _open /srv/ceph/osd/osd_journal fd 17: 10737418240 bytes, block size 4096 bytes, directio = 1, aio = 0
2017-10-15 09:03:20.255212 7f49bdb00900 1 filestore(/srv/ceph/osd) upgrade
2017-10-15 09:03:20.258680 7f49bdb00900 0 <cls> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
2017-10-15 09:03:20.259598 7f49bdb00900 0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello
2017-10-15 09:03:20.327155 7f49bdb00900 0 osd.0 0 crush map has features 2199057072128, adjusting msgr requires for clients
2017-10-15 09:03:20.327167 7f49bdb00900 0 osd.0 0 crush map has features 2199057072128 was 8705, adjusting msgr requires for mons
2017-10-15 09:03:20.327171 7f49bdb00900 0 osd.0 0 crush map has features 2199057072128, adjusting msgr requires for osds
2017-10-15 09:03:20.327199 7f49bdb00900 0 osd.0 0 load_pgs
2017-10-15 09:03:20.327210 7f49bdb00900 0 osd.0 0 load_pgs opened 0 pgs
2017-10-15 09:03:20.327216 7f49bdb00900 0 osd.0 0 using 0 op queue with priority op cut off at 64.
2017-10-15 09:03:20.331681 7f49bdb00900 -1 osd.0 0 log_to_monitors {default=true}
2017-10-15 09:03:20.339963 7f49bdb00900 0 osd.0 0 done with init, starting boot process
sh: 1: lsb_release: not found
2017-10-15 09:03:20.344114 7f49a25d3700 -1 lsb_release_parse - pclose failed: (13) Permission denied
2017-10-15 09:03:20.420408 7f49ae759700 0 osd.0 6 crush map has features 288232576282525696, adjusting msgr requires for clients
2017-10-15 09:03:20.420587 7f49ae759700 0 osd.0 6 crush map has features 288232576282525696 was 2199057080833, adjusting msgr requires for mons
2017-10-15 09:03:20.420596 7f49ae759700 0 osd.0 6 crush map has features 288232576282525696, adjusting msgr requires for osds
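(The lsb_release lines look harmless; that binary just isn't installed on these hosts, and installing the distro's lsb-release package, e.g.

apt-get install lsb-release

should make the warning go away, assuming a Debian/Ubuntu base.)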
The cluster was created from scratch. Steps for creating OSDs --
ceph osd crush tunables jewel
ceph osd create f0960666-ad75-11e7-abc4-cec278b6b50a 0
ceph osd create 0e6295bc-adab-11e7-abc4-cec278b6b50a 1
ceph osd create 0e629828-adab-11e7-abc4-cec278b6b50a 2
ceph-osd -i 0 --mkfs --osd-uuid f0960666-ad75-11e7-abc4-cec278b6b50a -f -d
ceph-osd -i 1 --mkfs --osd-uuid 0e6295bc-adab-11e7-abc4-cec278b6b50a -f -d
ceph-osd -i 2 --mkfs --osd-uuid 0e629828-adab-11e7-abc4-cec278b6b50a -f -d
chown -R ceph /srv/ceph/osd/
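(For reference, the usual CRUSH registration that goes with manually created OSDs would be something like the following, with node1 as a placeholder hostname and 1.0 as a placeholder weight:

ceph osd crush add-bucket node1 host
ceph osd crush move node1 root=default
ceph osd crush add osd.0 1.0 host=node1
)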
Ceph was started with --
ceph-osd -i 0/1/2 -f -d --setuser ceph --setgroup ceph
I skipped the authentication part since the same problem occurs without cephx (auth set to none).
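(Disabling cephx here just means roughly this in ceph.conf:

[global]
auth cluster required = none
auth service required = none
auth client required = none
)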
In the meantime, Luminous works great with the same setup.