On Fri, May 18, 2012 at 3:07 AM, Vladimir Bashkirtsev
<vladimir@xxxxxxxxxxxxxxx> wrote:
> On 16/05/12 02:43, Gregory Farnum wrote:
>>
>> On Sun, May 6, 2012 at 5:53 PM, Vladimir Bashkirtsev
>> <vladimir@xxxxxxxxxxxxxxx> wrote:
>>>
>>> On 03/05/12 16:23, Greg Farnum wrote:
>>>>
>>>> On Wednesday, May 2, 2012 at 11:24 PM, Vladimir Bashkirtsev wrote:
>>>>>
>>>>> Greg,
>>>>>
>>>>> Apologies for the multiple emails: my mail server is backed by ceph
>>>>> now and it struggled this morning (separate issue). So my mail
>>>>> server reported back to my mailer that sending the email had failed
>>>>> when obviously that was not the case.
>>>>
>>>> Interesting — I presume you're using the file system? That's not
>>>> something we've heard of anybody doing with Ceph before. :)
>>>>
>>>>> [root@gamma ~]# ceph -s
>>>>> 2012-05-03 15:46:55.640951 mds e2666: 1/1/1 up {0=1=up:active}, 1 up:standby
>>>>> 2012-05-03 15:46:55.647106 osd e10728: 6 osds: 6 up, 6 in
>>>>> 2012-05-03 15:46:55.654052 log 2012-05-03 15:46:26.557084 mon.2 172.16.64.202:6789/0 2878 : [INF] mon.2 calling new monitor election
>>>>> 2012-05-03 15:46:55.654425 mon e7: 3 mons at {0=172.16.64.200:6789/0,1=172.16.64.201:6789/0,2=172.16.64.202:6789/0}
>>>>> 2012-05-03 15:46:56.961624 pg v1251669: 600 pgs: 2 creating, 598 active+clean; 309 GB data, 963 GB used, 1098 GB / 2145 GB avail
>>>>>
>>>>> Logging is on but nothing obvious in there: the logs are quite
>>>>> small. A number of "ceph health" calls are logged (ceph is monitored
>>>>> by nagios, so this record appears every 5 minutes), and the monitors
>>>>> periodically call for election (at varying intervals of between 1
>>>>> and 15 minutes, by the look of it). That's it.
>>>>
>>>> Hrm. Generally speaking the monitors shouldn't call for elections
>>>> unless something changes (one of them crashes) or the leader monitor
>>>> is slowing down.
>>>> Can you increase the debug_mon to 20, the debug_ms to 1, and post
>>>> one of the logs somewhere?
>>>> The "Live Debugging" section of
>>>> http://ceph.com/wiki/Debugging should give you what you need. :)
>>>
>>> Here's the logs and core dumps:
>>> http://www.bashkirtsev.com/logs-2012-05-07.tar.bz2
>>>
>>> The mons have grown to 1.2GB and 2GB of memory.
>>
>> When I look at the logs for mon.0, I see that there are a lot of
>> places where mon.0 takes tens of seconds to write something to disk.
>> If the disk is just about full, that might make sense (many
>> filesystems don't handle a nearly-full disk very well at all); and a
>> monitor getting stuck for that long could definitely explain why they
>> start using up so much memory (they're buffering messages). I suspect
>> that there's not anything particularly wrong here, unless I'm
>> misunderstanding the story you're telling me. :) Have you noticed this
>> problem when the monitor's disk partition isn't nearly full?
>> -Greg
>
> I have recreated the conditions under which the mon started to consume
> more memory: everything appears to be in line with your suspicions.
> When the disk gets almost full, the mon slows down and finally crashes
> quite badly, to the point that I cannot recover it. I am then forced to
> destroy the mon altogether and create a new one instead.
>
> Long story short: the docs/wiki should recommend NOT keeping the monfs
> on the same partition as the ceph log (which can grow quickly), and
> preferably keeping it on a separate partition altogether.

Patches and edits welcome! :)

> At the same time it raises another question: what is the recommended
> partition size for the monfs?

I'm looking at a cluster about a month old with a 765MB mon data
directory. Most of that (~500 MB) is in the log files, which can be
trimmed manually, and I believe that everything else taking up space
trims itself when things are working. So if you're willing to set up a
pseudo-log rotation (or do it yourself on a timer every month or so), a
couple of GB should leave you plenty of breathing room.
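[Editor's note: for anyone following along, the debug levels Greg asks
for can also be set persistently in ceph.conf. A sketch, using the
standard [mon] section; see the Debugging wiki page linked above for
the transient injectargs route:]

```ini
; Raise monitor debugging as suggested in the thread above.
; These go in the [mon] section of ceph.conf on the monitor hosts.
[mon]
    debug mon = 20
    debug ms = 1
```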
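[Editor's note: since a nearly-full mon partition is the failure mode
described here, a small watchdog is an easy safeguard. A minimal sketch
in shell; the mon data path and the 80% threshold are illustrative
assumptions, not anything stated in the thread:]

```shell
#!/bin/sh
# Sketch: warn before the partition holding the monfs fills up.
# A nearly-full disk stalls the mon and balloons its memory, as
# described above. Path and threshold are illustrative assumptions.

check_mon_disk() {
    mon_data="$1"
    threshold=80
    # Percent used on the filesystem holding the mon data directory
    # (column 5 of POSIX `df -P` output, with the trailing % stripped).
    usage=$(df -P "$mon_data" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    if [ "$usage" -ge "$threshold" ]; then
        echo "WARNING: mon partition ${usage}% full"
    else
        echo "OK: mon partition ${usage}% full"
    fi
}

# Demo on the root filesystem; point it at your real mon data dir
# instead, e.g. check_mon_disk /data/mon.0 (path is hypothetical).
check_mon_disk /
```

Run from cron alongside the nagios check Vladimir already has; trimming
the mon's old log files on the same timer covers the "pseudo-log
rotation" Greg suggests.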
-Greg