Hi Goncalo,

On Fri, Jul 8, 2016 at 3:01 AM, Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
> 5./ I have noticed that ceph-fuse (in 10.2.2) consumes about 1.5 GB of
> virtual memory when there is no applications using the filesystem.
>
>  7152 root      20   0 1108m  12m 5496 S   0.0  0.0  0:00.04 ceph-fuse
>
> When I only have one instance of the user application running, ceph-fuse (in
> 10.2.2) slowly rises with time up to 10 GB of memory usage.
>
> if I submit a large number of user applications simultaneously, ceph-fuse
> goes very fast to ~10GB.
>
>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
> 18563 root      20   0 10.0g 328m 5724 S   4.0  0.7  1:38.00 ceph-fuse
>  4343 root      20   0 3131m 237m  12m S   0.0  0.5 28:24.56 dsm_om_connsvcd
>  5536 goncalo   20   0 1599m  99m  32m R  99.9  0.2 31:35.46 python
> 31427 goncalo   20   0 1597m  89m  20m R  99.9  0.2 31:35.88 python
> 20504 goncalo   20   0 1599m  89m  20m R 100.2  0.2 31:34.29 python
> 20508 goncalo   20   0 1599m  89m  20m R  99.9  0.2 31:34.20 python
>  4973 goncalo   20   0 1599m  89m  20m R  99.9  0.2 31:35.70 python
>  1331 goncalo   20   0 1597m  88m  20m R  99.9  0.2 31:35.72 python
> 20505 goncalo   20   0 1597m  88m  20m R  99.9  0.2 31:34.46 python
> 20507 goncalo   20   0 1599m  87m  20m R  99.9  0.2 31:34.37 python
> 28375 goncalo   20   0 1597m  86m  20m R  99.9  0.2 31:35.52 python
> 20503 goncalo   20   0 1597m  85m  20m R 100.2  0.2 31:34.09 python
> 20506 goncalo   20   0 1597m  84m  20m R  99.5  0.2 31:34.42 python
> 20502 goncalo   20   0 1597m  83m  20m R  99.9  0.2 31:34.32 python

I've seen this type of thing before. It could be glibc's malloc arenas
for threads. See:

https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en

I would guess there are 20 cores on this machine*?

* 20 = 10GB/(8*64MB)

If the cause here is glibc arenas, I don't think we need to do
anything special. The virtual memory is not actually being used due to
Linux overcommit.
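The arithmetic behind that guess can be sketched as follows. This is only a
rough estimate: it assumes the 64-bit glibc defaults of up to 8 arenas per
core and 64 MB of address space reserved per arena, and the function names
are made up for illustration.

```python
# Assumed 64-bit glibc defaults; not measured on the machine in question.
ARENAS_PER_CORE = 8     # default cap on malloc arenas is 8 * number of cores
ARENA_RESERVE_MB = 64   # address space reserved per arena heap

def max_arena_virt_mb(cores):
    """Upper bound on virtual memory (MB) reserved by malloc arenas."""
    return cores * ARENAS_PER_CORE * ARENA_RESERVE_MB

def cores_for_virt_mb(virt_mb):
    """Core count that would explain a given arena reservation (MB)."""
    return virt_mb / (ARENAS_PER_CORE * ARENA_RESERVE_MB)

if __name__ == "__main__":
    print(cores_for_virt_mb(10 * 1024))  # VIRT of ~10 GB from the top output
    print(max_arena_virt_mb(20))         # bound for a 20-core machine
```

A machine with a different core count would hit a different ceiling, so the
same VIRT figure should not be expected everywhere.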
> 6./ On the machines where the user had the segfault, we have 16 GB of RAM
> and 1GB of SWAP
>
> Mem: 16334244k total, 3590100k used, 12744144k free, 221364k buffers
> Swap: 1572860k total, 10512k used, 1562348k free, 2937276k cached

But do we know that ceph-fuse is using 10G VM on those machines (the
core count may be different)?

> 7./ I think what is happening is that once the user submits his sets of
> jobs, the memory usage goes to the very limit on this type machine, and the
> raise is actually to fast that ceph-fuse segfaults before OOM Killer can
> kill it.

It's possible, but we have no evidence yet that ceph-fuse is using up
all the memory on those machines, right?

-- 
Patrick Donnelly
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com