On Fri, Jun 29, 2012 at 1:54 PM, Brian Edmonds <mornir@xxxxxxxxx> wrote:
> On Fri, Jun 29, 2012 at 11:55 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> So right now you're using the Ceph filesystem, rather than RBD, right?
>
> Right, CephFS. I'm actually not even very clear on what RBD is, and
> how one might use it, but I'm sure I'll understand that in the
> fullness of time. I came to Ceph from a background of wanting to
> replace my primary RAID array with a RAIM (redundant array of
> inexpensive machines) cluster, and a co-worker suggested Ceph as a
> possibility.
>
>> What processes do you have running on which machines/VMs? What's the
>> CPU usage on the ceph-mds process?
>
> I have four VMs running Debian testing, under a dom0 on a recent
> 6-core AMD CPU (I forget which one). Each VM has two virtual cores,
> 1GB of RAM, and a 500GB virtual disk partition formatted with btrfs,
> used for both data and journal. These are somewhat smaller than
> recommended, but in the right ballpark, and the filesystem has so far
> not been used to store any significant amount of data. (Mostly just
> bonnie tests.)
>
> All four VMs are running OSD, the first three are running MON, and
> the first two MDS. I mostly watch top on the first machine (if
> there's a better tool for watching a cluster, please let me know),
> and it shows the majority of the CPU time in wait, with the Ceph jobs
> popping up from time to time with a fraction of a percent, sometimes
> up into single digits. It's also not uncommon to see a lot of idle
> time. When I get some time I'm going to wrap some sort of collector
> around the log files and feed the data into OpenTSDB.

Okay, there are two things I'd do here. First, create a cluster that
only has one MDS — the multi-MDS system is significantly less stable.
Second, you've got 3 monitors doing frequent fsyncs and 4 OSDs doing
frequent syncs, all of which funnel into a single disk. That's going
to go poorly no matter what you're doing. ;) Try a smaller cluster
with just one monitor, one or two OSDs, and one MDS.

>> And a warning: the filesystem, while nifty, is not yet
>> production-ready — it works great for some use cases but there are
>> some serious known bugs that aren't very hard to trigger, as we've
>> been doing a lot of QA on RADOS and its associated systems (which
>> the filesystem depends on) at the expense of the filesystem itself.
>
> Good to know. For now I'm just playing, but I eventually want to have
> a distributed filesystem that I can use. I'm curious to see how Ceph
> does when deployed on real hardware, which I expect to have in the
> next couple weeks. Very simple stuff compared to what I see others on
> the list discussing: a few dual core Atom systems with 1TB of drive
> and 4GB of RAM each, all on a 1Gb switch.
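For reference, a bare-bones ceph.conf for that kind of one-monitor,
one-MDS test cluster might look roughly like the sketch below. The
hostnames, address, and data paths are just placeholders (adjust them
to whatever your VMs actually use); nothing here is copied from your
setup.

[global]
        auth supported = cephx

[mon.a]
        host = vm1
        mon addr = 192.168.0.11:6789
        mon data = /srv/ceph/$name

[mds.a]
        host = vm1

[osd]
        osd data = /srv/ceph/$name
        osd journal = /srv/ceph/$name/journal
        osd journal size = 1000

[osd.0]
        host = vm1

[osd.1]
        host = vm2

With something like that on each node, "mkcephfs -a -c
/etc/ceph/ceph.conf" followed by "service ceph -a start" should bring
the cluster up, and "ceph -s" or "ceph -w" is a friendlier way to
watch cluster health than top.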
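On the OpenTSDB idea: a first-cut collector can be as simple as
tailing a log and writing OpenTSDB's plain-text "put" protocol to
port 4242. The sketch below is only meant to illustrate that shape;
the log path, the regex, the metric name, and the tags are all
made-up placeholders rather than anything Ceph actually emits.

#!/usr/bin/env python
# Rough sketch only: tail a Ceph log, pull one number out of each
# matching line, and push it to OpenTSDB via its plain-text "put"
# protocol. Log path, regex, metric name, and tags are placeholders.
import re
import socket
import time

TSD_ADDR = ("localhost", 4242)          # assumed OpenTSDB listener
LOG_PATH = "/var/log/ceph/osd.0.log"    # placeholder log file

def follow(path):
    # Yield lines as they are appended to the file, like `tail -f`.
    with open(path) as f:
        f.seek(0, 2)                    # start at end of file
        while True:
            line = f.readline()
            if not line:
                time.sleep(1.0)
                continue
            yield line

def main():
    sock = socket.create_connection(TSD_ADDR)
    # Placeholder pattern: grab a "latency=<float>" token if present.
    pattern = re.compile(r"latency=([0-9.]+)")
    for line in follow(LOG_PATH):
        match = pattern.search(line)
        if not match:
            continue
        # OpenTSDB line protocol: put <metric> <timestamp> <value> <tags>
        msg = "put ceph.osd.latency %d %s host=vm1 osd=0\n" % (
            int(time.time()), match.group(1))
        sock.sendall(msg.encode())

if __name__ == "__main__":
    main()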