Andreas Dilger (adilger@xxxxxxxxx) wrote on 14 June 2012 23:34:

>On 2012-06-14, at 10:20 PM, Carlos Carvalho wrote:
>> Our server has suddenly become extremely slow in writes. With
>>
>> % dd if=/dev/zero of=zero bs=2M count=2000
>>
>> I get only 1.4MB/s in an almost idle machine. It has a raid6 with 6
>> disks, running 3.3.7. 3.4.2 doesn't improve matters.
>>
>> The important point is that it became so slow yesterday. No hardware
>> has changed and all disks are fine. Reading from the filesystem is
>> unaffected, at more than 85MB/s. The problem happens both in the root
>> and home partitions.
>>
>> During the dd the disk utilization measured by sar is small:
>>
>> DEV      tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz  await  svctm  %util
>> sdb    93.30    254.40    931.20     12.71      0.57   6.10   4.16  38.77
>> sdc   100.50    244.00    970.40     12.08      0.76   7.59   3.80  38.20
>> sdd   101.40    261.60    927.20     11.72      0.68   6.68   3.74  37.90
>> sde    86.80    300.00    780.80     12.45      0.69   7.93   4.29  37.20
>> sdf    82.90    315.20    810.40     13.58      0.55   6.60   4.39  36.37
>> sda    96.70    220.00    984.00     12.45      0.47   4.87   3.72  35.94
>>
>> So it seems that it's not a disk problem. Both filesystems reached 90%
>> usage in space but less than 10% in inodes shortly before the slowness
>> appeared. Now they're at 57% and 77% but still crawl. Could this be
>> related?
>
>Definitely yes. I was going to ask this question even before I got to
>the end and saw your comment. If the filesystem gets too full, it can
>cause new allocations to become fragmented, and then even if you delete
>other files, the fragmented files end up leaving individual allocated
>blocks all around your filesystem, making future allocations bad also.
>
>You can run "e2freefrag" on the block device to report the current sizes
>for free space in the filesystem. There is the "e4defrag" tool that can
>defragment files, but I don't recall whether this is working well or not.

Thanks. Here's the output for two of the filesystems:

Device: /dev/md1
Blocksize: 4096 bytes
Total blocks: 2621376
Free blocks: 1208294 (46.1%)

Min. free extent: 4 KB
Max. free extent: 1482688 KB
Avg. free extent: 492 KB
Num. free extent: 9817

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    4K...    8K-  :            983           983    0.08%
    8K...   16K-  :           1067          2580    0.21%
   16K...   32K-  :           1700          9243    0.76%
   32K...   64K-  :           3245         37692    3.12%
   64K...  128K-  :            812         18279    1.51%
  128K...  256K-  :            703         31845    2.64%
  256K...  512K-  :            573         49964    4.14%
  512K... 1024K-  :            395         66984    5.54%
    1M...    2M-  :            222         79026    6.54%
    2M...    4M-  :             54         35170    2.91%
    4M...    8M-  :             14         17754    1.47%
    8M...   16M-  :             13         39447    3.26%
   16M...   32M-  :             15         85868    7.11%
   32M...   64M-  :             17        203310   16.83%
   64M...  128M-  :              1         18211    1.51%
  128M...  256M-  :              1         64895    5.37%
  256M...  512M-  :              1         76372    6.32%
    1G...    2G-  :              1        370672   30.68%

Device: /dev/md3
Blocksize: 4096 bytes
Total blocks: 488378357
Free blocks: 122356070 (25.1%)

Min. free extent: 4 KB
Max. free extent: 2096512 KB
Avg. free extent: 9416 KB
Num. free extent: 51348

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    4K...    8K-  :           7303          7303    0.01%
    8K...   16K-  :           4413         10063    0.01%
   16K...   32K-  :           2766         13920    0.01%
   32K...   64K-  :           3647         40394    0.03%
   64K...  128K-  :           2751         61344    0.05%
  128K...  256K-  :           2613        116467    0.10%
  256K...  512K-  :           2717        251263    0.21%
  512K... 1024K-  :           3511        658799    0.54%
    1M...    2M-  :           5158       1833440    1.50%
    2M...    4M-  :           6008       4421326    3.61%
    4M...    8M-  :           5247       7242579    5.92%
    8M...   16M-  :           1787       4422718    3.61%
   16M...   32M-  :           1442       7825134    6.40%
   32M...   64M-  :            719       7570736    6.19%
   64M...  128M-  :            547      14188577   11.60%
  128M...  256M-  :            309      13895787   11.36%
  256M...  512M-  :            203      18558351   15.17%
  512M... 1024M-  :            176      28142764   23.00%
    1G...    2G-  :             31      11647234    9.52%

It seems to me that free space is not excessively fragmented.
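I haven't tried e4defrag here. For completeness, my understanding is that something like the commands below would report a fragmentation score without actually rewriting anything, and show the extent layout of the dd test file. These are untested on this box, and the /home paths are only examples of where the test file might live, not something I actually ran:

% e4defrag -c /home
% filefrag -v /home/zero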
Anyway, in our case it was not the problem. I managed to put the disks in another machine and there the behaviour was normal. Putting the disks back in the production machine, it is normal again (?!)... I had to unplug all power cords first; otherwise it wouldn't even enter the BIOS.

While running 3.4.2, some call traces appeared and NFS stopped. This didn't happen with 3.3.7 even when it was crawling, so I don't know if it's a kernel problem. Here's the log, just in case:

Jun 14 14:13:38 hoggar kernel: Pid: 17751, comm: nfsd Tainted: G D 3.4.2 #1 ASUSTek Computer INC. KFSN5-D/KFSN5-D
Jun 14 14:13:38 hoggar kernel: EIP: 0060:[<0005bd9c>] EFLAGS: 00010246 CPU: 0
Jun 14 14:13:38 hoggar kernel: EAX: fd8977e0 EBX: 00000000 ECX: 3686d477 EDX: 00000000
Jun 14 14:13:38 hoggar kernel: ESI: 00000000 EDI: 00000000 EBP: 00000001 ESP: e6281bd4
Jun 14 14:13:38 hoggar kernel: DS: 0068 ES: 0068 FS: 00d8 GS: 0068 SS: 0068
Jun 14 14:13:38 hoggar kernel: CR0: 8005003b CR2: 00000000 CR3: 01235000 CR4: 000007f0
Jun 14 14:13:38 hoggar kernel: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Jun 14 14:13:38 hoggar kernel: DR6: ffff0ff0 DR7: 00000400
Jun 14 14:13:38 hoggar kernel: Process nfsd (pid: 17751, ti=c2f19340 task=c2f190c0 task.ti=c2f19340)
Jun 14 14:13:38 hoggar kernel: Stack:
Jun 14 14:13:38 hoggar kernel: 0000005d 49bdf715 fff91232 00000000 0000003c 00000000 00000000 ee9f2334
Jun 14 14:13:38 hoggar kernel: 00000b90 0005c741 000000b1 00000000 00000001 000000b1 00000001 02809f00
Jun 14 14:13:38 hoggar kernel: c2f190c0 00000000 00000014 01000000 c2f190c0 00000001 00001770 ffffffff
Jun 14 14:13:38 hoggar kernel: Call Trace:
Jun 14 14:13:38 hoggar kernel: [<0005c741>] ? balance_dirty_pages_ratelimited_nr+0x303/0x517
Jun 14 14:13:38 hoggar kernel: [<00058015>] ? generic_file_buffered_write+0x176/0x1e0
Jun 14 14:13:38 hoggar kernel: [<00006b3e>] ? ich_force_enable_hpet+0xc/0x124
Jun 14 14:13:38 hoggar kernel: [<0005847e>] ? __generic_file_aio_write+0x3ff/0x431
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<001bc2cc>] ? memcpy_toiovec+0xcf/0x107
Jun 14 14:13:38 hoggar kernel: [<0008702b>] ? find_inode_fast+0x3e/0x6a
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<0005850d>] ? generic_file_aio_write+0x5d/0xb3
Jun 14 14:13:38 hoggar kernel: [<000bc397>] ? ext4_file_mmap+0x39/0x39
Jun 14 14:13:38 hoggar kernel: [<000bc57c>] ? ext4_file_write+0x1e5/0x233
Jun 14 14:13:38 hoggar kernel: [<000eb5fc>] ? nfsd_acceptable+0x6e/0x74
Jun 14 14:13:38 hoggar kernel: [<000bc397>] ? ext4_file_mmap+0x39/0x39
Jun 14 14:13:38 hoggar kernel: [<00075373>] ? do_sync_readv_writev+0x9e/0xd3
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<0007566b>] ? do_readv_writev+0x81/0x128
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<000bc397>] ? ext4_file_mmap+0x39/0x39
Jun 14 14:13:38 hoggar kernel: [<00075782>] ? vfs_writev+0x37/0x3c
Jun 14 14:13:38 hoggar kernel: [<000ec1e4>] ? nfsd_vfs_write.isra.13+0x92/0x1ba
Jun 14 14:13:38 hoggar kernel: [<0007255b>] ? kmem_cache_alloc+0x46/0x8e
Jun 14 14:13:38 hoggar kernel: [<000bc252>] ? ext4_release_file+0x76/0x76
Jun 14 14:13:38 hoggar kernel: [<0007349b>] ? __dentry_open.isra.17+0x153/0x220
Jun 14 14:13:38 hoggar kernel: [<00008001>] ? native_calibrate_tsc+0x3e0/0x4e5
Jun 14 14:13:38 hoggar kernel: [<000edb02>] ? nfsd_write+0xad/0xc2
Jun 14 14:13:38 hoggar kernel: [<000f1aa5>] ? nfsd3_proc_write+0x85/0x9a
Jun 14 14:13:38 hoggar kernel: [<00007b3e>] ? tsc_refine_calibration_work+0x13c/0x16c
Jun 14 14:13:38 hoggar kernel: [<000e9fb5>] ? nfsd_dispatch+0x84/0x12a
Jun 14 14:13:38 hoggar kernel: [<00210fdc>] ? svc_process+0x385/0x5fc
Jun 14 14:13:38 hoggar kernel: [<000186a3>] ? arch_pick_mmap_layout+0xf/0x11c
Jun 14 14:13:38 hoggar kernel: [<00007bc4>] ? native_sched_clock+0x56/0xad
Jun 14 14:13:38 hoggar kernel: [<000e9a97>] ? nfsd+0xc7/0xfc
Jun 14 14:13:38 hoggar kernel: [<000e99d0>] ? exportfs_decode_fh+0x218/0x218
Jun 14 14:13:38 hoggar kernel: [<000308a8>] ? kthread+0x68/0x6d
Jun 14 14:13:38 hoggar kernel: [<00030840>] ? kthread_flush_work_fn+0x8/0x8
Jun 14 14:13:38 hoggar kernel: [<0022cc86>] ? kernel_thread_helper+0x6/0xd
Jun 14 14:13:38 hoggar kernel: Code: f1 89 ca 29 c2 39 54 24 30 73 2c 89 c8 2b 44 24 30 8b 5c 24 08 46 0f af d8 f7 64 24 04 01 da 31 db 89 c1 39 f2 72 08 89 d0 31 d2 <f7> f6 89 c3 89 c8 f7 f6 89 da eb 1d 8b 4c 24 08 31 d2 c1 f9 1f
Jun 14 14:13:38 hoggar kernel: EIP: [<0005bd9c>] bdi_position_ratio.isra.14+0x141/0x1a9 SS:ESP 0068:e6281bd4
Jun 14 14:13:38 hoggar kernel: ---[ end trace c19aa5d263f5a20d ]---
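For what it's worth, if a vmlinux with debug info for this exact 3.4.2 build were available, I believe the faulting EIP could be mapped back to a source line with something like the following (untested; gdb may need the full clone name bdi_position_ratio.isra.14 rather than the plain function name):

% gdb -batch -ex 'list *(bdi_position_ratio+0x141)' vmlinux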