Re: Two issues remaining after luminous upgrade

So after much trial and tribulation, I ended up solving both problems on my own. For the first, I had to rejigger the pools to balance out the objects-per-pg average. This isn’t my preferred method, because the problem could recur once the cluster gets loaded.
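For anyone curious, the rejiggering amounted to raising the PG count on the dense pools so the objects-per-pg figure came down. A minimal sketch of the idea, using the images pool with placeholder values rather than my exact numbers:

# Hypothetical values: raise pg_num on an over-dense pool, then match
# pgp_num so the data actually redistributes across the new PGs.
ceph osd pool set images pg_num 128
ceph osd pool set images pgp_num 128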

 

On the second, I ended up removing previous optimizations, which seemed to clear up the issue of osds flapping. I’m still at a loss as to why these settings were causing osds to go down, and I’m wondering if someone who knows this better could explain it. Here are the removed options:

 

[root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ~] # diff ceph.conf.old  ceph-conf/ceph.conf
12,13d11
< mon_osd_down_out_interval = 30
< mon_osd_report_timeout = 300
18d15
< osd_journal_size = 10000
24,28d20
< max_open_files = 131072
< osd_max_backfills = 2
< osd_recovery_max_active = 2
< osd_recovery_op_priority = 1
< osd_client_op_priority = 63
30,32d21
< ms_dispatch_throttle_bytes = 1048576000
< objecter_inflight_op_bytes = 1048576000
< osd_deep_scrub_stride=5242880
36c25
< mon_pg_warn_max_object_skew = 10
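As an aside, a quick way to see which values differ from the built-in defaults on a running daemon is the admin socket’s config diff (osd.12 here only because that is the daemon I was chasing; verify the command on your version):

# Show options whose runtime values differ from the compiled-in defaults
ceph daemon osd.12 config diff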

 

Thanks,

Matthew Stroud

 

From: Matthew Stroud <mattstroud@xxxxxxxxxxxxx>
Date: Thursday, January 25, 2018 at 3:15 PM
To: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
Subject: Two issues remaining after luminous upgrade

 

The first and hopefully easy one:

 

I have a situation where I have two pools that are rarely used (a third will be in use once I get through these issues), but they need to be present at the whims of our cloud team. Is there a way I can turn off the ‘2 pools have many more objects per pg than average’ warning?

 

What I have done up to this point is play with ‘mon_pg_warn_max_object_skew’, but that didn’t remove the message. After googling and looking through the docs, nothing stood out to me as a way to resolve the issue.
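One avenue I have not verified, so treat it as an assumption: in Luminous the PG health checks are reportedly generated by ceph-mgr rather than the mons, so the skew option may need to be set where the mgr reads it, e.g. in ceph.conf, followed by a mgr restart:

# Assumption: ceph-mgr evaluates MANY_OBJECTS_PER_PG in Luminous, so the
# option must be visible to the mgr; a skew of 0 should silence the check
# if that is correct.
[mgr]
mon_pg_warn_max_object_skew = 0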

 

Technical info:

 

[root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ~] # ceph health detail
HEALTH_WARN 2 pools have many more objects per pg than average
MANY_OBJECTS_PER_PG 2 pools have many more objects per pg than average
    pool images objects per pg (480) is more than 60 times cluster average (8)
    pool metrics objects per pg (336) is more than 42 times cluster average (8)

 

The second:

 

I’m seeing osds randomly get marked down, even though the daemons report that they are still running. This issue wasn’t present before the upgrade. This is a multipath setup, but the paths appear healthy and the cluster isn’t really being utilized at the moment. Please let me know if you want more information:

 

Ceph.log:

 

2018-01-25 14:56:29.011831 mon.mon01 mon.0 10.20.57.10:6789/0 823 : cluster [INF] osd.12 marked down after no beacon for 300.775605 seconds
2018-01-25 14:56:29.013280 mon.mon01 mon.0 10.20.57.10:6789/0 824 : cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)
2018-01-25 14:56:32.034002 mon.mon01 mon.0 10.20.57.10:6789/0 830 : cluster [INF] Health check cleared: OSD_DOWN (was: 1 osds down)
2018-01-25 14:56:31.322228 osd.12 osd.12 10.20.57.14:6804/4163 1 : cluster [WRN] Monitor daemon marked osd.12 down, but it is still running
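One thing I notice: the ‘no beacon for 300.775605 seconds’ lines up exactly with the mon_osd_report_timeout = 300 in my ceph.conf. If I understand the Luminous beacon mechanism correctly (an assumption worth verifying), OSDs send beacons every osd_beacon_report_interval seconds, which defaults to 300, so a timeout equal to the beacon interval leaves no headroom and a single delayed beacon gets the OSD marked down. Both values can be checked on the live daemons:

# Beacon cadence on the OSD (default 300 s, if I recall correctly)
ceph daemon osd.12 config get osd_beacon_report_interval
# Timeout before the mon marks a silent OSD down (default 900 s)
ceph daemon mon.mon01 config get mon_osd_report_timeout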

 

Ceph-osd.12.log:

 

2018-01-25 14:56:00.606493 7facfde03700  4 rocksdb: (Original Log Time 2018/01/25-14:56:00.602100) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/memtable_list.cc:360] [default] Level-0 commit table #213 started
2018-01-25 14:56:00.606498 7facfde03700  4 rocksdb: (Original Log Time 2018/01/25-14:56:00.606406) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/memtable_list.cc:383] [default] Level-0 commit table #213: memtable #1 done
2018-01-25 14:56:00.606517 7facfde03700  4 rocksdb: (Original Log Time 2018/01/25-14:56:00.606437) EVENT_LOG_v1 {"time_micros": 1516917360606429, "job": 29, "event": "flush_finished", "lsm_state": [2, 1, 1, 0, 0, 0, 0], "immutable_memtables": 0}
2018-01-25 14:56:00.606529 7facfde03700  4 rocksdb: (Original Log Time 2018/01/25-14:56:00.606466) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/db_impl_compaction_flush.cc:132] [default] Level summary: base level 1 max bytes base 268435456 files[2 1 1 0 0 0 0] max score 0.50
2018-01-25 14:56:00.606538 7facfde03700  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/db_impl_files.cc:388] [JOB 29] Try to delete WAL files size 252104127, prev total WAL file size 253684537, number of live WAL files 2.

 

2018-01-25 14:56:31.322223 7fad1262c700  0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.12 down, but it is still running
2018-01-25 14:56:31.322233 7fad1262c700  0 log_channel(cluster) log [DBG] : map e18531 wrongly marked me down at e18530
2018-01-25 14:56:31.322236 7fad1262c700  1 osd.12 18531 start_waiting_for_healthy
2018-01-25 14:56:31.327816 7fad0c620700  1 osd.12 pg_epoch: 18530 pg[14.8f( v 18432'17 (0'0,18432'17] local-lis/les=18521/18522 n=1 ec=18405/18405 lis/c 18521/18521 les/c/f 18522/18522/0 18530/18530/18530) [3,19] r=-1 lpr=18530 pi=[18521,18530)/1 luod=0'0 crt=18432'17 lcod 0'0 active] start_peering_interval up [12,3,19] -> [3,19], acting [12,3,19] -> [3,19], acting_primary 12 -> 3, up_primary 12 -> 3, role 0 -> -1, features acting 2305244844532236283 upacting 2305244844532236283
2018-01-25 14:56:31.327851 7fad0be1f700  1 osd.12 pg_epoch: 18530 pg[14.9e( empty local-lis/les=18522/18523 n=0 ec=18405/18405 lis/c 18522/18522 les/c/f 18523/18523/0 18530/18530/18530) [15,10] r=-1 lpr=18530 pi=[18522,18530)/1 crt=0'0 active] start_peering_interval up [12,15,10] -> [15,10], acting [12,15,10] -> [15,10], acting_primary 12 -> 15, up_primary 12 -> 15, role 0 -> -1, features acting 2305244844532236283 upacting 2305244844532236283
2018-01-25 14:56:31.327918 7fad0c620700  1 osd.12 pg_epoch: 18531 pg[14.8f( v 18432'17 (0'0,18432'17] local-lis/les=18521/18522 n=1 ec=18405/18405 lis/c 18521/18521 les/c/f 18522/18522/0 18530/18530/18530) [3,19] r=-1 lpr=18530 pi=[18521,18530)/1 crt=18432'17 lcod 0'0 unknown NOTIFY] state<Start>: transitioning to Stray

 

Ceph osd tree:

 

[root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ceph-conf] # ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       31.99658 root default
-2        7.99915     host osd01
 0   ssd  0.99989         osd.0      up  1.00000 1.00000
 1   ssd  0.99989         osd.1      up  1.00000 1.00000
 5   ssd  0.99989         osd.5      up  1.00000 1.00000
 6   ssd  0.99989         osd.6      up  1.00000 1.00000
 7   ssd  0.99989         osd.7      up  1.00000 1.00000
11   ssd  0.99989         osd.11     up  1.00000 1.00000
20   ssd  0.99989         osd.20     up  1.00000 1.00000
22   ssd  0.99989         osd.22     up  1.00000 1.00000
-3        7.99915     host osd02
12   ssd  0.99989         osd.12     up  1.00000 1.00000
18   ssd  0.99989         osd.18     up  1.00000 1.00000
23   ssd  0.99989         osd.23     up  1.00000 1.00000
26   ssd  0.99989         osd.26     up  1.00000 1.00000
27   ssd  0.99989         osd.27     up  1.00000 1.00000
28   ssd  0.99989         osd.28     up  1.00000 1.00000
29   ssd  0.99989         osd.29     up  1.00000 1.00000
30   ssd  0.99989         osd.30     up  1.00000 1.00000
-4        7.99915     host osd03
13   ssd  0.99989         osd.13     up  1.00000 1.00000
15   ssd  0.99989         osd.15     up  1.00000 1.00000
16   ssd  0.99989         osd.16     up  1.00000 1.00000
17   ssd  0.99989         osd.17     up  1.00000 1.00000
19   ssd  0.99989         osd.19     up  1.00000 1.00000
21   ssd  0.99989         osd.21     up  1.00000 1.00000
24   ssd  0.99989         osd.24     up  1.00000 1.00000
25   ssd  0.99989         osd.25     up  1.00000 1.00000
-5        7.99915     host osd04
 2   ssd  0.99989         osd.2      up  1.00000 1.00000
 3   ssd  0.99989         osd.3      up  1.00000 1.00000
 4   ssd  0.99989         osd.4      up  1.00000 1.00000
 8   ssd  0.99989         osd.8      up  1.00000 1.00000
 9   ssd  0.99989         osd.9      up  1.00000 1.00000
10   ssd  0.99989         osd.10     up  1.00000 1.00000
14   ssd  0.99989         osd.14     up  1.00000 1.00000
31   ssd  0.99989         osd.31     up  1.00000 1.00000

 

Mon settings matching ‘down’:

 

[root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ceph-conf] # ceph --admin-daemon /var/run/ceph/ceph-mon.mon01.asok config show | grep -i down
    "mds_mon_shutdown_timeout": "5.000000",
    "mds_shutdown_check": "0",
    "mon_osd_adjust_down_out_interval": "true",
    "mon_osd_down_out_interval": "30",
    "mon_osd_down_out_subtree_limit": "rack",
    "mon_osd_min_down_reporters": "2",
    "mon_pg_check_down_all_threshold": "0.500000",
    "mon_warn_on_osd_down_out_interval_zero": "true",
    "osd_backoff_on_down": "true",
    "osd_debug_shutdown": "false",
    "osd_journal_flush_on_shutdown": "true",
    "osd_max_markdown_count": "5",
    "osd_max_markdown_period": "600",
    "osd_mon_shutdown_timeout": "5.000000",
    "osd_shutdown_pgref_assert": "false",

 

 




