Just thought of one other thing; allow me to insert that below.

On Fri, Nov 28, 2014 at 1:04 PM, Florian Haas <florian@xxxxxxxxxxx> wrote:
> Hi everyone,
>
> I'd like to come back to a discussion from 2012 (thread at
> http://marc.info/?l=ceph-devel&m=134808745719233) to estimate the
> expected MDS memory consumption from file metadata caching. I am certain
> the following is full of untested assumptions, some of which are
> probably inaccurate, so please shoot those down as needed.
>
> I did an entirely unscientific study of a real data set (my laptop, in
> case you care to know) which currently holds about 70G worth of data in
> a huge variety of file sizes and several file systems, and currently
> lists about 944,000 inodes as being in use. So going purely by order of
> magnitude and doing a wild approximation, I'll assume a ratio of 1
> million files in 100G, or 10,000 files per gigabyte, which means an
> average file size of about 100KB -- again, approximating and forgetting
> about the difference between 10^3 and 2^10, and using a stupid
> arithmetic mean rather than a median which would probably be much more
> useful.
>
> If I were to assume that all those files were in CephFS, and they were
> all somehow regularly in use (or at least one file in each directory),
> then the Ceph MDS would have to keep the metadata of all those files in
> cache. Suppose further that the stat struct for all those files is
> anywhere between 1 and 2KB, and we go by an average of 1.5KB metadata
> per file including some overhead, then that would mean the average
> metadata per file is about 1.5% of the average file size. So for my 100G
> of data, the MDS would use about 1.5G of RAM for caching.
>
> If you scale that up for a filestore of say a petabyte, that means all
> your Ceph MDSs would consume a relatively whopping 15TB in total RAM for
> metadata caching, again assuming that *all* the data is actually used by
> clients.
>
> Now of course it's entirely unrealistic that in a production system data
> is actually ever used across the board, but are the above considerations
> "close enough" for a rule-of-thumb approximation of MDS memory
> footprint? As in,
>
> Total MDS RAM = (Total used storage) * (fraction of data in regular use)
> * 0.015

Out of curiosity: would it matter at all whether or not a significant
fraction of the files in CephFS were hard links? Clearly the only thing
that differs in metadata between individual hard-linked files is the
file name, but I wonder if the Ceph MDS actually takes this into
consideration. In other words, I'm not sure whether the MDS simply adds
another pointer to the same set of metadata, or whether that set of
metadata is actually duplicated in MDS memory. I am guessing the latter,
but it would be nice to be sure.

> If CephFS users could use a rule of thumb like that, it would help them
> answer questions like "given a filesystem of size X, will a single MDS
> be enough to hold my metadata caches if Y is the maximum amount of
> memory I can afford for budget Z".
>
> All thoughts and comments much appreciated. Thank you!
>
> Cheers,
> Florian

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
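
For anyone who wants to plug in their own numbers, the rule of thumb
quoted above can be sketched in a few lines of Python. This is only an
illustration of the arithmetic in this thread: the function name and
its parameters are made up for this example, and the defaults (100KB
average file size, 1.5KB of cached metadata per file) are the untested
guesses from the original post, not measured MDS figures. Sizes use
powers of ten, as in the original estimate.

```python
def estimate_mds_cache_bytes(used_storage_bytes,
                             fraction_in_use=1.0,
                             avg_file_bytes=100_000,
                             metadata_bytes_per_file=1_500):
    """Rough MDS cache estimate following the rule of thumb in this thread.

    All defaults are assumptions taken from the original post:
    ~100KB average file size and ~1.5KB of metadata per cached file,
    which together give the 0.015 factor.
    """
    # Approximate number of files held in the given amount of storage.
    file_count = used_storage_bytes / avg_file_bytes
    # Only the files in regular use are assumed to stay in the MDS cache.
    cached_files = file_count * fraction_in_use
    return cached_files * metadata_bytes_per_file


GB = 10**9
TB = 10**12
PB = 10**15

# The two cases from the post, with all data assumed to be in use.
laptop = estimate_mds_cache_bytes(100 * GB)
petabyte = estimate_mds_cache_bytes(1 * PB)

print(f"100G of data: {laptop / GB:.1f}G of MDS cache")
print(f"1PB of data:  {petabyte / TB:.1f}T of MDS cache")
```

With the defaults this reproduces the 1.5G and 15TB figures from the
post. Lowering fraction_in_use scales the result linearly, so the
estimate is only as good as the guess for that fraction and for the
per-file metadata size.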