Re: Insane CPU utilization in ceph.fuse


 



Hello,

How long does the profiler need to run? This is a production environment, and the memory profiler combined with the low cache size (set because of the problem) causes very high CPU usage on the OSDs and the MDS, which makes the MDS fail while the profiler is running. Would it be a problem to run it during a low-traffic period? There would be less load, so it might not fail, but the profile might also contain less information about usage.

Greetings! 

2018-07-24 10:21 GMT+02:00 Yan, Zheng <ukernel@xxxxxxxxx>:
I mean:

ceph tell mds.x heap start_profiler

... wait for some time

ceph tell mds.x heap stop_profiler

pprof --text /usr/bin/ceph-mds /var/log/ceph/ceph-mds.x.profile.<largest number>.heap
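
The steps above could be wrapped in a small script so that the profiler is always stopped again, even if the wait is interrupted. This is only a sketch: the helper names (`latest_heap_profile`, `profiler_commands`, `profile_mds`) are hypothetical, the `ceph tell ... heap start_profiler/stop_profiler` commands are the ones quoted above, and the `.profile.<number>.heap` file naming is assumed from the pprof line in this message. Numeric comparison is used to pick the "largest number" so that an unpadded `10` sorts after `9`.

```python
import re
import subprocess
import time
from typing import List, Optional

# Assumed naming, taken from the pprof line above: ...profile.<number>.heap
HEAP_RE = re.compile(r"\.profile\.(\d+)\.heap$")


def latest_heap_profile(paths: List[str]) -> Optional[str]:
    """Return the path with the largest numeric profile index, or None."""
    numbered = []
    for path in paths:
        match = HEAP_RE.search(path)
        if match:
            numbered.append((int(match.group(1)), path))
    return max(numbered)[1] if numbered else None


def profiler_commands(mds_id: str) -> List[List[str]]:
    """Build the start/stop commands quoted in the message above."""
    target = f"mds.{mds_id}"
    return [
        ["ceph", "tell", target, "heap", "start_profiler"],
        ["ceph", "tell", target, "heap", "stop_profiler"],
    ]


def profile_mds(mds_id: str, wait_seconds: float, dry_run: bool = True) -> List[List[str]]:
    """Start the heap profiler, wait, then stop it.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    start, stop = profiler_commands(mds_id)
    if dry_run:
        return [start, stop]
    subprocess.run(start, check=True)
    try:
        time.sleep(wait_seconds)
    finally:
        # Stop the profiler even if the wait is interrupted (e.g. Ctrl-C).
        subprocess.run(stop, check=True)
    return [start, stop]
```

Calling `profile_mds("x", 600, dry_run=False)` would run the profiler for ten minutes; the wait time is an arbitrary illustration, not a recommendation from this thread. The result of `latest_heap_profile(glob.glob("/var/log/ceph/*.heap"))` can then be passed to `pprof --text /usr/bin/ceph-mds`.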




On Tue, Jul 24, 2018 at 3:18 PM Daniel Carrasco <d.carrasco@xxxxxxxxx> wrote:
>
> This is what I get:
>
> --------------------------------------------------------
> --------------------------------------------------------
> --------------------------------------------------------
> :/# ceph tell mds.kavehome-mgto-pro-fs01 heap dump
> 2018-07-24 09:05:19.350720 7fc562ffd700  0 client.1452545 ms_handle_reset on 10.22.0.168:6800/1685786126
> 2018-07-24 09:05:29.103903 7fc563fff700  0 client.1452548 ms_handle_reset on 10.22.0.168:6800/1685786126
> mds.kavehome-mgto-pro-fs01 dumping heap profile now.
> ------------------------------------------------
> MALLOC:      760199640 (  725.0 MiB) Bytes in use by application
> MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
> MALLOC: +    246962320 (  235.5 MiB) Bytes in central cache freelist
> MALLOC: +     43933664 (   41.9 MiB) Bytes in transfer cache freelist
> MALLOC: +     41012664 (   39.1 MiB) Bytes in thread cache freelists
> MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
> MALLOC:   ------------
> MALLOC: =   1102295200 ( 1051.2 MiB) Actual memory used (physical + swap)
> MALLOC: +   4268335104 ( 4070.6 MiB) Bytes released to OS (aka unmapped)
> MALLOC:   ------------
> MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
> MALLOC:
> MALLOC:          33027              Spans in use
> MALLOC:             19              Thread heaps in use
> MALLOC:           8192              Tcmalloc page size
> ------------------------------------------------
> Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
> Bytes released to the OS take up virtual address space but no physical memory.
>
>
> --------------------------------------------------------
> --------------------------------------------------------
> --------------------------------------------------------
> :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
> 2018-07-24 09:14:25.747706 7f94fffff700  0 client.1452578 ms_handle_reset on 10.22.0.168:6800/1685786126
> 2018-07-24 09:14:25.754034 7f95057fa700  0 client.1452581 ms_handle_reset on 10.22.0.168:6800/1685786126
> mds.kavehome-mgto-pro-fs01 tcmalloc heap stats:------------------------------------------------
> MALLOC:      960649328 (  916.1 MiB) Bytes in use by application
> MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
> MALLOC: +    108867288 (  103.8 MiB) Bytes in central cache freelist
> MALLOC: +     37179424 (   35.5 MiB) Bytes in transfer cache freelist
> MALLOC: +     40143000 (   38.3 MiB) Bytes in thread cache freelists
> MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
> MALLOC:   ------------
> MALLOC: =   1157025952 ( 1103.4 MiB) Actual memory used (physical + swap)
> MALLOC: +   4213604352 ( 4018.4 MiB) Bytes released to OS (aka unmapped)
> MALLOC:   ------------
> MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
> MALLOC:
> MALLOC:          33028              Spans in use
> MALLOC:             19              Thread heaps in use
> MALLOC:           8192              Tcmalloc page size
> ------------------------------------------------
> Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
> Bytes released to the OS take up virtual address space but no physical memory.
>
> --------------------------------------------------------
> --------------------------------------------------------
> --------------------------------------------------------
> After heap release:
> :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
> 2018-07-24 09:15:28.540203 7f2f7affd700  0 client.1443339 ms_handle_reset on 10.22.0.168:6800/1685786126
> 2018-07-24 09:15:28.547153 7f2f7bfff700  0 client.1443342 ms_handle_reset on 10.22.0.168:6800/1685786126
> mds.kavehome-mgto-pro-fs01 tcmalloc heap stats:------------------------------------------------
> MALLOC:      710315776 (  677.4 MiB) Bytes in use by application
> MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
> MALLOC: +    246471880 (  235.1 MiB) Bytes in central cache freelist
> MALLOC: +     40802848 (   38.9 MiB) Bytes in transfer cache freelist
> MALLOC: +     38689304 (   36.9 MiB) Bytes in thread cache freelists
> MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
> MALLOC:   ------------
> MALLOC: =   1046466720 (  998.0 MiB) Actual memory used (physical + swap)
> MALLOC: +   4324163584 ( 4123.8 MiB) Bytes released to OS (aka unmapped)
> MALLOC:   ------------
> MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
> MALLOC:
> MALLOC:          33177              Spans in use
> MALLOC:             19              Thread heaps in use
> MALLOC:           8192              Tcmalloc page size
> ------------------------------------------------
> Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
> Bytes released to the OS take up virtual address space but no physical memory.
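
To compare dumps like the three above, the `MALLOC:` lines can be parsed and the freelist categories summed, which shows how much memory tcmalloc is holding that the application is not actually using. The following is a minimal sketch, not part of Ceph or gperftools: the function names are hypothetical, and the sample text is copied from the "After heap release" dump above.

```python
import re
from typing import Dict

# Matches lines such as:
#   MALLOC: +    246471880 (  235.1 MiB) Bytes in central cache freelist
LINE_RE = re.compile(
    r"MALLOC:\s*[+=]?\s*(\d+)\s*\(\s*[\d.]+ MiB\)\s*(.+?)\s*$"
)

SAMPLE = """\
MALLOC:      710315776 (  677.4 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +    246471880 (  235.1 MiB) Bytes in central cache freelist
MALLOC: +     40802848 (   38.9 MiB) Bytes in transfer cache freelist
MALLOC: +     38689304 (   36.9 MiB) Bytes in thread cache freelists
MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =   1046466720 (  998.0 MiB) Actual memory used (physical + swap)
MALLOC: +   4324163584 ( 4123.8 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
"""


def parse_heap_stats(text: str) -> Dict[str, int]:
    """Map each byte-count label in a tcmalloc stats dump to its value."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)  # search() tolerates a leading "> " quote
        if match:
            stats[match.group(2)] = int(match.group(1))
    return stats


def freelist_bytes(stats: Dict[str, int]) -> int:
    """Sum every category whose label mentions a freelist."""
    return sum(value for label, value in stats.items() if "freelist" in label)


if __name__ == "__main__":
    stats = parse_heap_stats(SAMPLE)
    held = freelist_bytes(stats)
    print(f"in use by application: {stats['Bytes in use by application']} bytes")
    print(f"held in freelists:     {held} bytes")
```

For the sample, the freelists add up to about 311 MiB on top of the 677.4 MiB in use by the application; together with the malloc metadata this accounts for the reported 998.0 MiB of actual memory used.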
>
>
> The other commands fail with a curl error:
> Failed to get profile: curl 'http:///pprof/profile?seconds=30' > /root/pprof/.tmp.ceph-mds.1532416424.:
>
>
> Greetings!!
>
> 2018-07-24 5:35 GMT+02:00 Yan, Zheng <ukernel@xxxxxxxxx>:
>>
>> could you profile memory allocation of mds
>>
>> http://docs.ceph.com/docs/mimic/rados/troubleshooting/memory-profiling/
>> On Tue, Jul 24, 2018 at 7:54 AM Daniel Carrasco <d.carrasco@xxxxxxxxx> wrote:
>> >
>> > Yes, that is also my thread. It was created before I lowered the cache size from 512 MB to 8 MB. I thought it might be my fault, a misconfiguration on my side, so I ignored the problem until now.
>> >
>> > Greetings!
>> >
>> > On Tue, 24 Jul 2018 at 1:00, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> >>
>> >> On Mon, Jul 23, 2018 at 11:08 AM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
>> >>>
>> >>> On Mon, Jul 23, 2018 at 5:48 AM, Daniel Carrasco <d.carrasco@xxxxxxxxx> wrote:
>> >>> > Hi, thanks for your response.
>> >>> >
>> >>> > There are about 6 clients, and 4 of them are on standby most of the time.
>> >>> > Only two are active servers serving the website. We also have Varnish in
>> >>> > front, so they do not receive all the load (below 30% in PHP is not much).
>> >>> > As for the MDS cache, I now have mds_cache_memory_limit set to 8 MB.
>> >>>
>> >>> What! Please post `ceph daemon mds.<name> config diff`, `... perf
>> >>> dump`, and `... dump_mempools` from the server the active MDS is on.
>> >>>
>> >>> > I've also
>> >>> > tested 512 MB, but the CPU usage is the same and the MDS RAM usage grows to
>> >>> > 15 GB (on a 16 GB server it starts to swap and everything fails). With 8 MB, at
>> >>> > least the memory usage stays stable below 6 GB (it is now using about 1 GB of RAM).
>> >>>
>> >>> We've seen reports of possible memory leaks before and the potential
>> >>> fixes for those were in 12.2.6. How fast does your MDS reach 15GB?
>> >>> Your MDS cache size should be configured to 1-8GB (depending on your
>> >>> preference) so it's disturbing to see you set it so low.
>> >>
>> >>
>> >> See also the thread "Fwd: MDS memory usage is very high", which had more discussion of that. The MDS daemon seemingly had 9.5GB of allocated RSS but only believed 489MB was in use for the cache...
>> >> -Greg
>> >>
>> >>>
>> >>>
>> >>> --
>> >>> Patrick Donnelly
>> >>> _______________________________________________
>> >>> ceph-users mailing list
>> >>> ceph-users@xxxxxxxxxxxxxx
>> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>
>
>
>
> --
> _________________________________________
>
>       Daniel Carrasco Marín
>       Ingeniería para la Innovación i2TIC, S.L.
>       Tlf:  +34 911 12 32 84 Ext: 223
>       www.i2tic.com
> _________________________________________



--
_________________________________________

      Daniel Carrasco Marín
      Ingeniería para la Innovación i2TIC, S.L.
      Tlf:  +34 911 12 32 84 Ext: 223
      www.i2tic.com
_________________________________________
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
