Changing the failure domain

Dear all!

In our Hammer cluster we are planning to switch the failure domain from host to chassis. We have run some simulations, and regardless of the settings we used, slow requests appeared every time.
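For reference, the kind of CRUSH change involved looks roughly like this; the bucket names, pool rule, and replica count below are placeholders, and the mapping test is done offline with crushtool before anything is injected into the cluster:

```shell
# Export the current CRUSH map and decompile it to text:
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# In crushmap.txt: add chassis buckets, move the host buckets under
# them, and change the replicated rule's step from
#   step chooseleaf firstn 0 type host
# to
#   step chooseleaf firstn 0 type chassis

# Recompile and check offline how the mappings would change
# (replica count 3 here is a placeholder):
crushtool -c crushmap.txt -o crushmap.new
crushtool -i crushmap.new --test --show-mappings --num-rep 3 | head
```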

We had the following settings:

 "osd_max_backfills": "1",
    "osd_backfill_full_ratio": "0.85",
    "osd_backfill_retry_interval": "10",
    "osd_backfill_scan_min": "1",
    "osd_backfill_scan_max": "4",
    "osd_kill_backfill_at": "0",
    "osd_debug_skip_full_check_in_backfill_reservation": "false",
    "osd_debug_reject_backfill_probability": "0",

   "osd_min_recovery_priority": "0",
    "osd_allow_recovery_below_min_size": "true",
    "osd_recovery_threads": "1",
    "osd_recovery_thread_timeout": "60",
    "osd_recovery_thread_suicide_timeout": "300",
    "osd_recovery_delay_start": "0",
    "osd_recovery_max_active": "1",
    "osd_recovery_max_single_start": "1",
    "osd_recovery_max_chunk": "8388608",
    "osd_recovery_forget_lost_objects": "false",
    "osd_recovery_op_priority": "1",
    "osd_recovery_op_warn_multiple": "16",


We have also tested it with the CFQ I/O scheduler on the OSD disks and the following parameters:
    "osd_disk_thread_ioprio_priority": "7"
    "osd_disk_thread_ioprio_class": "idle"

and with the nodeep-scrub flag set.
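Concretely, that combination amounts to something like the following (the device name sdb is a placeholder for each OSD data disk, and the exact steps we used may have differed):

```shell
# Switch the OSD data disk to the CFQ scheduler so ioprio takes effect
# (sdb is a placeholder device name):
echo cfq > /sys/block/sdb/queue/scheduler

# Lower the OSD disk thread I/O priority at runtime:
ceph tell osd.* injectargs '--osd-disk-thread-ioprio-class idle --osd-disk-thread-ioprio-priority 7'

# Disable deep scrubbing for the duration of the rebalance:
ceph osd set nodeep-scrub
```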

Is there anything else we could try? Is there a good way to switch from one kind of failure domain to another without causing slow requests?

Thank you in advance for any suggestions.

Kind regards,
Laszlo


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


