Re: Revisiting MDS memory footprint

On Fri, Nov 28, 2014 at 3:14 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
> On 11/28/2014 01:04 PM, Florian Haas wrote:
>> Hi everyone,
>>
>> I'd like to come back to a discussion from 2012 (thread at
>> http://marc.info/?l=ceph-devel&m=134808745719233) to estimate the
>> expected MDS memory consumption from file metadata caching. I am certain
>> the following is full of untested assumptions, some of which are
>> probably inaccurate, so please shoot those down as needed.
>>
>> I did an entirely unscientific study of a real data set (my laptop, in
>> case you care to know) which currently holds about 70G worth of data in
>> a huge variety of file sizes and several file systems, and currently
>> lists about 944,000 inodes as being in use. So going purely by order of
>> magnitude and doing a wild approximation, I'll assume a ratio of 1
>> million files in 100G, or 10,000 files per gigabyte, which means an
>> average file size of about 100KB -- again, approximating and forgetting
>> about the difference between 10^3 and 2^10, and using a stupid
>> arithmetic mean rather than a median, which would probably be much
>> more useful.
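
For what it's worth, here's that arithmetic as a quick Python sketch
(the inode count and capacity are the figures quoted above; decimal
units throughout, per the same approximation):

    inodes = 944000         # in-use inodes on the sample laptop
    capacity_gb = 70.0      # used capacity in GB
    files_per_gb = inodes / capacity_gb       # ~13,500; call it ~10,000
    avg_file_kb = capacity_gb * 1e6 / inodes  # ~74 KB; call it ~100 KB
    print("%.0f files/GB, %.0f KB/file" % (files_per_gb, avg_file_kb))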
>>
>> If I were to assume that all those files were in CephFS, and they were
>> all somehow regularly in use (or at least one file in each directory),
>> then the Ceph MDS would have to keep the metadata of all those files in
>> cache. Suppose further that the stat struct for each of those files
>> takes anywhere between 1 and 2KB; if we go by an average of 1.5KB of
>> metadata per file, including some overhead, then the average metadata
>> per file is about 1.5% of the average file size. So for my 100G of
>> data, the MDS would use about 1.5G of RAM for caching.
>>
>> If you scale that up to a filestore of, say, a petabyte, your Ceph
>> MDSs would consume a whopping 15TB of RAM in total for metadata
>> caching, again assuming that *all* the data is actually in use by
>> clients.
>>
>
> Why do you assume that ALL MDSs keep ALL metadata in memory? Isn't the
> whole point of directory fragmentation that each of them keeps a
> portion of the inodes in memory to spread the load?

Directory subtree partitioning is considered neither stable nor
supported, which is why it's important to understand what a single
active MDS will hold.

>> Now of course it's entirely unrealistic that in a production system data
>> is actually ever used across the board, but are the above considerations
>> "close enough" for a rule-of-thumb approximation of MDS memory
>> footprint? As in,
>>
>> Total MDS RAM = (Total used storage) * (fraction of data in regular use)
>> * 0.015
>>
>> If CephFS users could use a rule of thumb like that, it would help them
>> answer questions like "given a filesystem of size X, will a single MDS
>> be enough to hold my metadata caches if Y is the maximum amount of
>> memory I can afford for budget Z".
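
That rule of thumb is trivial to turn into a back-of-envelope
calculator; a minimal sketch in Python (function and parameter names
are mine, and the 1.5% overhead is the assumption from above):

    def mds_ram_gb(used_storage_gb, fraction_in_use, overhead=0.015):
        # overhead: ~1.5KB of cached metadata per ~100KB average file
        return used_storage_gb * fraction_in_use * overhead

    print(mds_ram_gb(100, 1.0))   # 1.5 GB -- the 100G laptop example
    print(mds_ram_gb(1e6, 1.0))   # 15000.0 GB = 15 TB -- the petabyte case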
>>
>> All thoughts and comments much appreciated. Thank you!
>>
>> Cheers,
>> Florian