Re: Performance after adding a node


 



WOW!!!  Those are some awfully high backfilling settings you have there.  They are 100% the reason that your customers think your system is down.  You're telling each OSD that it may run 20 backfill operations at the exact same time.  I bet if you watched iostat -x 1 on one of your nodes before and after you inject those settings, you would see disk utilization go from a decent 40-70% all the way up to 100% as soon as the settings are injected.
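If you want a single number to compare before and after, here is a rough sketch of a helper that pulls the busiest disk's %util out of iostat output.  The function name max_util is made up for illustration, and it assumes %util is the last column of each device line and that the device table starts with a "Device" header row, which is how sysstat's iostat -x prints it.

```shell
# Print the highest %util seen in the LAST report of `iostat -x` output
# read from stdin. Assumes %util is the last column of each device line.
max_util() {
  awk '
    $1 ~ /^Device/ { in_dev = 1; max = 0; next }  # header row: new report, reset
    NF == 0        { in_dev = 0; next }           # blank line ends the device table
    in_dev && $NF + 0 > max { max = $NF + 0 }     # track the busiest device
    END { printf "%.1f\n", max + 0 }
  '
}

# Usage on an OSD node: two reports 5 seconds apart. The first report is the
# average since boot, so only the second (the last) one is used.
#   iostat -x 5 2 | max_util
```

Run it once before injecting the settings and once after; the difference should make the problem obvious.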

When you are backfilling, you are copying data from one drive to another.  Each increment of osd-max-backfills is another copy operation the OSD tries to run at the same time.  These can be receiving data (writing to the disk) or moving data off (reading from the disk followed by a delete).  So by having 20 backfills happening at a time, you are telling each disk to allow 20 copies to be written to and/or read from it at the same time.  What happens to a disk when you are copying 20 large files to it at once?  All of them move slower (a lot of that is disk thrashing from 20 threads all reading and writing to different parts of the disk).

What you want to find is the point where your disks are usually around 80-90% utilized while backfilling, but not consistently 100%.  The easy way to do that is to increase your osd-max-backfills by 1 or 2 at a time until you see utilization go too high, and then back off.  I don't know many people who go above 5 max backfills in a production cluster on spinning disks.  Usually the ones who do, do it temporarily while they know their cluster isn't being utilized by customers much.
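The step-up-then-back-off loop can be sketched as a small decision function.  This is only an illustration: the name next_backfills, the step size of 1, and the exact 80/90 cut-offs are my assumptions based on the range above, not anything Ceph provides.

```shell
# Suggest the next osd-max-backfills value from the current value and the
# busiest disk's %util. Thresholds and step size are illustrative.
next_backfills() {
  current=$1
  util=${2%.*}                # drop any fractional part, e.g. 87.4 -> 87
  if [ "$util" -ge 90 ] && [ "$current" -gt 1 ]; then
    echo $((current - 1))     # disks saturated: back off by one
  elif [ "$util" -lt 80 ]; then
    echo $((current + 1))     # headroom left: allow one more backfill
  else
    echo "$current"           # in the 80-90% band (or already at 1): leave it
  fi
}

# Example: apply the suggestion by hand, then watch iostat again.
#   new=$(next_backfills 2 65)
#   ceph tell 'osd.*' injectargs "--osd-max-backfills $new"
```

Give the cluster a minute or two to settle after each change before reading utilization again, and stop raising the value once your application starts to feel slow, even if the disks still show headroom.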

Personally I have never needed osd-recovery-threads or osd-recovery-max-active; I've been able to tune my clusters using only osd-max-backfills.  The lower you leave it, the longer the backfill will take, but the less impact your customers will notice.  I've found 3 to be a generally safe number if customer IO is your priority, and 5 works well if your customers can be ok with it being slow (but still usable)... but all of this depends on your hardware and software use-cases.  Watch your disk utilization and test your application while finding the right number for your environment.

Good Luck :)

On Mon, May 8, 2017 at 5:43 PM Daniel Davidson <danield@xxxxxxxxxxxxxxxx> wrote:
Our ceph system performs very poorly or not even at all while the
remapping procedure is underway.  We are using replica 2 and the
following ceph tweaks while it is in process:

  ceph tell osd.* injectargs '--osd-recovery-max-active 20'
  ceph tell osd.* injectargs '--osd-recovery-threads 20'
  ceph tell osd.* injectargs '--osd-max-backfills 20'
  ceph -w
  ceph osd set noscrub
  ceph osd set nodeep-scrub

After the remapping finishes, we set these back to default.

Are any of these causing our problems or is there another way to limit
the impact of the remapping so that users do not think the system is
down while we add more storage?


thanks,

Dan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
