On 11/15/2014 12:48 PM, Andrey Korolyov wrote:
> Hello,
>
> I had found recently that the OSD daemons under certain conditions
> (moderate vm pressure, moderate I/O, slightly altered vm settings) can
> go into loop involving isolate_freepages and effectively hit Ceph
> cluster performance. I found this thread

Do you feel it is a regression compared to some older kernel version?

> https://lkml.org/lkml/2012/6/27/545, but looks like that the
> significant decrease of bdi max_ratio did not helped even for a bit.
> Although I have approximately a half of physical memory for cache-like
> stuff, the problem with mm persists, so I would like to try
> suggestions from the other people. In current testing iteration I had
> decreased vfs_cache_pressure to 10 and raised vm_dirty_ratio and
> background ratio to 15 and 10 correspondingly (because default values
> are too spiky for mine workloads). The host kernel is a linux-stable
> 3.10.

Well, I'm glad to hear it's not 3.18-rc3 this time. But I would
recommend trying it, or at least 3.17. A lot of patches have gone in
since 3.10 to reduce compaction overhead (especially for transparent
hugepages).

> Non-default VM settings are:
> vm.swappiness = 5
> vm.dirty_ratio=10
> vm.dirty_background_ratio=5
> bdi_max_ratio was 100%, right now 20%, at a glance it looks like the
> situation worsened, because unstable OSD host cause domino-like effect
> on other hosts, which are starting to flap too and only cache flush
> via drop_caches is helping.
>
> Unfortunately there are no slab info from "exhausted" state due to
> sporadic nature of this bug, will try to catch next time.
>
> slabtop (normal state):
>  Active / Total Objects (% used)    : 8675843 / 8965833 (96.8%)
>  Active / Total Slabs (% used)      : 224858 / 224858 (100.0%)
>  Active / Total Caches (% used)     : 86 / 132 (65.2%)
>  Active / Total Size (% used)       : 1152171.37K / 1253116.37K (91.9%)
>  Minimum / Average / Maximum Object : 0.01K / 0.14K / 15.75K
>
>    OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 6890130 6889185  99%    0.10K 176670       39    706680K buffer_head
>  751232  721707  96%    0.06K  11738       64     46952K kmalloc-64
>  251636  226228  89%    0.55K   8987       28    143792K radix_tree_node
>  121696   45710  37%    0.25K   3803       32     30424K kmalloc-256
>  113022   80618  71%    0.19K   2691       42     21528K dentry
>  112672   35160  31%    0.50K   3521       32     56336K kmalloc-512
>   73136   72800  99%    0.07K   1306       56      5224K Acpi-ParseExt
>   61696   58644  95%    0.02K    241      256       964K kmalloc-16
>   54348   36649  67%    0.38K   1294       42     20704K ip6_dst_cache
>   53136   51787  97%    0.11K   1476       36      5904K sysfs_dir_cache
>   51200   50724  99%    0.03K    400      128      1600K kmalloc-32
>   49120   46105  93%    1.00K   1535       32     49120K xfs_inode
>   30702   30702 100%    0.04K    301      102      1204K Acpi-Namespace
>   28224   25742  91%    0.12K    882       32      3528K kmalloc-128
>   28028   22691  80%    0.18K    637       44      5096K vm_area_struct
>   28008   28008 100%    0.22K    778       36      6224K xfs_ili
>   18944   18944 100%    0.01K     37      512       148K kmalloc-8
>   16576   15154  91%    0.06K    259       64      1036K anon_vma
>   16475   14200  86%    0.16K    659       25      2636K sigqueue
>
> zoneinfo (normal state, attached)
>
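
For comparing a "normal" slabtop dump like the one quoted above against one captured in the "exhausted" state, it may help to rank caches by size mechanically. The following is only a minimal sketch: it assumes the eight-column row layout shown in the quoted output (OBJS, ACTIVE, USE, OBJ SIZE, SLABS, OBJ/SLAB, CACHE SIZE, NAME), and the names `SlabRow`, `parse_slabtop_row` and `top_caches` are hypothetical helpers, not part of any existing tool.

```python
from typing import Iterable, List, NamedTuple


class SlabRow(NamedTuple):
    """One data row of slabtop output (hypothetical helper type)."""
    objs: int
    active: int
    use_pct: int
    obj_size_kb: float
    slabs: int
    objs_per_slab: int
    cache_size_kb: int
    name: str


def parse_slabtop_row(line: str) -> SlabRow:
    """Parse one slabtop data row; leading '>' quote markers are ignored.

    Assumes the eight-column layout from the quoted mail.
    """
    fields = line.lstrip("> ").split(None, 7)
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
    objs, active, use, obj_size, slabs, per_slab, cache_size, name = fields
    return SlabRow(
        objs=int(objs),
        active=int(active),
        use_pct=int(use.rstrip("%")),
        obj_size_kb=float(obj_size.rstrip("K")),
        slabs=int(slabs),
        objs_per_slab=int(per_slab),
        cache_size_kb=int(cache_size.rstrip("K")),
        name=name.strip(),
    )


def top_caches(lines: Iterable[str], n: int = 5) -> List[SlabRow]:
    """Return the n largest caches by CACHE SIZE, largest first."""
    rows = [parse_slabtop_row(line) for line in lines if line.strip("> \n")]
    return sorted(rows, key=lambda r: r.cache_size_kb, reverse=True)[:n]


if __name__ == "__main__":
    # Three rows copied from the quoted "normal state" output.
    sample = [
        "> 6890130 6889185 99% 0.10K 176670 39 706680K buffer_head",
        "> 751232 721707 96% 0.06K 11738 64 46952K kmalloc-64",
        "> 251636 226228 89% 0.55K 8987 28 143792K radix_tree_node",
    ]
    for row in top_caches(sample, n=2):
        print(f"{row.name:20s} {row.cache_size_kb:>10d}K")
```

Running the same ranking on a dump taken while the problem is occurring would show which caches grew relative to the normal state.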