On Tue, 30 Dec 2014 14:08:32 +1000 Lindsay Mathieson wrote:

> On Tue, 30 Dec 2014 12:48:58 PM Christian Balzer wrote:
> > > Looks like I misunderstood the purpose of the monitors, I presumed
> > > they were just for monitoring node health. They do more than that?
> > >
> >
> > They keep the maps and the pgmap in particular is of course very busy.
> > All that action is at: /var/lib/ceph/mon/<monitorname>/store.db/ .
> >
> > In addition monitors log like no tomorrow, also straining the OS
> > storage.
>
> Yikes!
>
> Did a quick check, root & data storage at under 10% usage - Phew!
>
The DB doesn't (or at least shouldn't) grow out of bounds, and the logs, while chatty, ought to be rotated.

Your issue is IOPS, i.e. how busy those SSDs are, more than anything else. But even crappy SSDs should be just fine.

Use a good monitoring tool like atop to watch how busy things are, and do that while running a normal rados bench like this from a client node:

rados -p rbd bench 60 write -t 32

And again like this:

rados -p rbd bench 60 write -t 32 -b 4096

In particular (but not only), compare the CPU usage during those runs.

> Could the third under spec'd monitor (which only has 1GB Eth) be slowing
> things down? Worthwhile removing it as a test?

Check with atop, but I doubt it.
The network should be fine, storage on SSD should be fine, and the memory (if it's not doing anything else) should do for your cluster size. The CPU probably as well, but that is for you to check.

Also, the primary monitor is the one with the lowest IP (unfortunately not documented anywhere or configurable).

Christian
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
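
For reference, a quick way to check the points above on a monitor node (a rough sketch; the paths are the defaults mentioned earlier and the exact output of quorum_status may differ between releases):

du -sh /var/lib/ceph/mon/*/store.db /var/log/ceph/   # how big the mon DB and logs actually are
iostat -x 2                                          # or atop; watch %util on the SSDs during the two bench runs
ceph quorum_status | grep quorum_leader_name         # which monitor is currently acting as the leader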