
The patch titled
     vfs: avoid large kmalloc()s for the fdtable
has been removed from the -mm tree.  Its filename was
     vfs-avoid-large-kmallocs-for-the-fdtable.patch

This patch was dropped because it was merged into mainline or a subsystem tree

The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/

------------------------------------------------------
Subject: vfs: avoid large kmalloc()s for the fdtable
From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

Azurit reports large increases in system time after 2.6.36 when running
Apache.  It was bisected down to a892e2d7dcdfa6c76e6 ("vfs: use kmalloc()
to allocate fdmem if possible").

That patch caused the vfs to use kmalloc() for very large allocations and
this is causing excessive work (and presumably excessive reclaim) within
the page allocator.

Fix it by falling back to vmalloc() earlier - when the allocation attempt
would have been considered "costly" by reclaim.

Reported-by: azurIt <azurit@xxxxxxxx>
Tested-by: azurIt <azurit@xxxxxxxx>
Cc: Changli Gao <xiaosuo@xxxxxxxxx>
Cc: Americo Wang <xiyou.wangcong@xxxxxxxxx>
Cc: Jiri Slaby <jslaby@xxxxxxx>
Acked-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Cc: Mel Gorman <mel@xxxxxxxxx>
Cc: <stable@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/file.c |   18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff -puN fs/file.c~vfs-avoid-large-kmallocs-for-the-fdtable fs/file.c
--- a/fs/file.c~vfs-avoid-large-kmallocs-for-the-fdtable
+++ a/fs/file.c
@@ -9,6 +9,7 @@
 #include <linux/module.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
+#include <linux/mmzone.h>
 #include <linux/time.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
@@ -39,14 +40,17 @@ int sysctl_nr_open_max = 1024 * 1024; /*
  */
 static DEFINE_PER_CPU(struct fdtable_defer, fdtable_defer_list);
 
-static inline void *alloc_fdmem(unsigned int size)
+static void *alloc_fdmem(unsigned int size)
 {
-	void *data;
-
-	data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN);
-	if (data != NULL)
-		return data;
-
+	/*
+	 * Very large allocations can stress page reclaim, so fall back to
+	 * vmalloc() if the allocation size will be considered "large" by the VM.
+	 */
+	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
+		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN);
+		if (data != NULL)
+			return data;
+	}
 	return vmalloc(size);
 }
 
_

Patches currently in -mm which might be from akpm@xxxxxxxxxxxxxxxxxxxx are

origin.patch
linux-next.patch
next-remove-localversion.patch
i-need-old-gcc.patch
arch-alpha-kernel-systblss-remove-debug-check.patch
drivers-i2c-busses-i2c-designware-corec-needs-delayh.patch
arch-x86-include-asm-delayh-fix-udelay-and-ndelay-for-8-bit-args.patch
drivers-gpu-drm-radeon-atomc-fix-warning.patch
leds-route-kbd-leds-through-the-generic-leds-layer.patch
backlight-add-backlight-type-fix.patch
backlight-add-backlight-type-fix-fix.patch
drivers-video-backlight-adp5520_blc-check-strict_strtoul-return-value-fix.patch
drivers-message-fusion-mptsasc-fix-warning.patch
osst-wrong-index-used-in-inner-loop-checkpatch-fixes.patch
drivers-scsi-osstc-fix-warning.patch
drbd-fix-warning.patch
drivers-usb-misc-usbtestc-fix-warning.patch
mm.patch
mm-nommu-sort-mm-mmap-list-properly-fix.patch
mm-per-node-vmstat-show-proper-vmstats-fix.patch
mm-mem-hotplug-update-pcp-stat_threshold-when-memory-hotplug-occur-fix.patch
mm-mmu_gather-rework-fix.patch
mm-uninline-large-generic-tlbh-functions.patch
mm-thp-optimize-memcg-charge-in-khugepaged-fix.patch
mm-convert-mm-cpu_vm_cpumask-into-cpumask_var_t-checkpatch-fixes.patch
writeback-sync-expired-inodes-first-in-background-writeback-fix.patch
vmscan-change-shrink_slab-interfaces-by-passing-shrink_control-fix.patch
vmscan-change-shrink_slab-interfaces-by-passing-shrink_control-fix-2.patch
vmscan-change-shrinker-api-by-passing-shrink_control-struct-fix.patch
vmscan-change-shrinker-api-by-passing-shrink_control-struct-fix-2.patch
frv-duplicate-output_buffer-of-e03-checkpatch-fixes.patch
hpet-factor-timer-allocate-from-open.patch
arch-alpha-include-asm-ioh-s-extern-inline-static-inline.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure-checkpatch-fixes.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure-fix.patch
printk-allocate-kernel-log-buffer-earlier-v2-checkpatch-fixes.patch
printk-allocate-kernel-log-buffer-earlier-v2-fix.patch
lru_cache-use-correct-type-in-sizeof-for-allocation-fix.patch
percpu_counter-change-return-value-and-add-comments-fix.patch
lib-hexdumpc-make-hex2bin-return-the-updated-src-address.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method-fix.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method-fix-fix.patch
fs-ncpfs-inodec-suppress-used-uninitialised-warning.patch
drivers-rtc-rtc-mrstc-use-release_mem_region-after-request_mem_region-fix.patch
rtc-driver-for-pt7c4338-chip-checkpatch-fixes.patch
rtc-driver-for-pt7c4338-chip-fix.patch
documentation-accounting-getdelaysc-handle-sendto-failures.patch
mm-move-enum-vm_event_item-into-a-standalone-header-file.patch
add-the-pagefault-count-into-memcg-stats-fix.patch
cpusets-randomize-node-rotor-used-in-cpuset_mem_spread_node.patch
cpusets-randomize-node-rotor-used-in-cpuset_mem_spread_node-cpusets-initialize-spread-rotor-lazily-fix.patch
fs-partitions-efic-corrupted-guid-partition-tables-can-cause-kernel-oops-fix.patch
scatterlist-new-helper-functions.patch
scatterlist-new-helper-functions-update-fix.patch
kexec-remove-kmsg_dump_kexec-fix.patch
journal_add_journal_head-debug.patch
mutex-subsystem-synchro-test-module-fix.patch
slab-leaks3-default-y.patch
put_bh-debug.patch
memblock-add-input-size-checking-to-memblock_find_region.patch
memblock-add-input-size-checking-to-memblock_find_region-fix.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
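
For readers who want to see where the new cutoff in alloc_fdmem() lands,
here is a rough userspace sketch of the size test.  It is not kernel code
and only mirrors the comparison against PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER.
Assumptions: 4 KiB pages (the SKETCH_PAGE_SIZE value is a stand-in, real
page size depends on the architecture), PAGE_ALLOC_COSTLY_ORDER of 3 as
defined in <linux/mmzone.h>, and an fd array that holds one pointer per
descriptor.  The helper names are hypothetical.

```c
#include <stddef.h>

/* Assumed values for illustration only; the kernel takes these from
 * its own headers and PAGE_SIZE varies by architecture. */
#define SKETCH_PAGE_SIZE     4096UL
#define SKETCH_COSTLY_ORDER  3

/* Largest size for which the patched alloc_fdmem() still tries kmalloc(). */
unsigned long costly_threshold(void)
{
	return SKETCH_PAGE_SIZE << SKETCH_COSTLY_ORDER;
}

/* Returns 1 if a request of this size would attempt kmalloc() first,
 * 0 if it would go straight to vmalloc(). */
int tries_kmalloc(unsigned long size)
{
	return size <= costly_threshold();
}

/* Bytes needed for an fd array of nr_fds entries, assuming one pointer
 * of ptr_size bytes per descriptor. */
unsigned long fd_array_bytes(unsigned long nr_fds, unsigned long ptr_size)
{
	return nr_fds * ptr_size;
}
```

Under those assumptions the threshold is 32768 bytes, so with 8-byte
pointers an fd array of up to 4096 entries still tries kmalloc() first,
and anything larger goes directly to vmalloc().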