On 11/28/2014 01:04 PM, Florian Haas wrote:
> Hi everyone,
>
> I'd like to come back to a discussion from 2012 (thread at
> http://marc.info/?l=ceph-devel&m=134808745719233) to estimate the
> expected MDS memory consumption from file metadata caching. I am certain
> the following is full of untested assumptions, some of which are
> probably inaccurate, so please shoot those down as needed.
>
> I did an entirely unscientific study of a real data set (my laptop, in
> case you care to know), which currently holds about 70G worth of data in
> a huge variety of file sizes across several file systems, and currently
> lists about 944,000 inodes as being in use. So going purely by order of
> magnitude and doing a wild approximation, I'll assume a ratio of 1
> million files per 100G, or 10,000 files per gigabyte, which means an
> average file size of about 100KB -- again, approximating and forgetting
> about the difference between 10^3 and 2^10, and using a stupid
> arithmetic mean rather than a median, which would probably be much more
> useful.
>
> If I were to assume that all those files were in CephFS, and they were
> all somehow regularly in use (or at least one file in each directory),
> then the Ceph MDS would have to keep the metadata of all those files in
> cache. Suppose further that the metadata the MDS caches for each of
> those files is anywhere between 1 and 2KB, and we go by an average of
> 1.5KB of metadata per file including some overhead; then the average
> metadata per file is about 1.5% of the average file size. So for my
> 100G of data, the MDS would use about 1.5G of RAM for caching.
>
> If you scale that up to a filestore of, say, a petabyte, that means all
> your Ceph MDSs would consume a relatively whopping 15TB of total RAM for
> metadata caching, again assuming that *all* the data is actually used by
> clients.
>

Why do you assume that ALL MDSs keep ALL metadata in memory? Isn't the
whole point of directory fragmentation that each of them keeps only a
portion of the inodes in memory, to spread the load?

> Now of course it's entirely unrealistic that in a production system the
> data is ever actually used across the board, but are the above
> considerations "close enough" for a rule-of-thumb approximation of the
> MDS memory footprint? As in,
>
> Total MDS RAM = (Total used storage) * (fraction of data in regular use)
> * 0.015
>
> If CephFS users could use a rule of thumb like that, it would help them
> answer questions like "given a filesystem of size X, will a single MDS
> be enough to hold my metadata caches if Y is the maximum amount of
> memory I can afford for budget Z?"
>
> All thoughts and comments much appreciated. Thank you!
>
> Cheers,
> Florian

-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
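
Working the proposed rule of thumb through in code makes the arithmetic
easier to reuse. The following is a minimal sketch based purely on the
assumptions from this thread (100KB average file size, 1.5KB of cached
metadata per file, hence the 0.015 factor); the function and parameter
names are made up for illustration and are not part of any Ceph tooling:

# Rough CephFS MDS cache-size estimator based on the rule of thumb
# discussed above. The 100KB average file size, 1.5KB of cached metadata
# per file and the resulting 0.015 factor are assumptions from the
# thread, not measured values; adjust them for your own workload.

def estimate_mds_cache_gb(total_used_gb, fraction_in_regular_use,
                          avg_file_size_kb=100.0, metadata_per_file_kb=1.5):
    """Estimate the aggregate RAM (in GB), across all active MDSs, needed
    to keep the metadata of every regularly used file in cache."""
    hot_data_gb = total_used_gb * fraction_in_regular_use
    files_cached = hot_data_gb * 1024 * 1024 / avg_file_size_kb   # GB -> KB
    return files_cached * metadata_per_file_kb / (1024 * 1024)    # KB -> GB

if __name__ == "__main__":
    # Florian's example: 100G fully "hot" -> roughly 1.5G of cache.
    print("100G, all hot: %.1f GB" % estimate_mds_cache_gb(100, 1.0))
    # A petabyte with only 5% of the data in regular use.
    print("1PB, 5%% hot:   %.0f GB" % estimate_mds_cache_gb(1024 * 1024, 0.05))
    # A petabyte if *all* data were in use -- the ~15TB worst case above.
    print("1PB, all hot:  %.0f GB" % estimate_mds_cache_gb(1024 * 1024, 1.0))

As pointed out above, with multiple active MDSs and directory
fragmentation this estimate is an aggregate across all daemons, not a
per-MDS requirement.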