On Sun, Jan 24, 2010 at 12:14 AM, Martin K. Petersen
<martin.petersen@xxxxxxxxxx> wrote:
>>>>>> "Aleksander" == Aleksander Adamowski <linux@xxxxxxxxxx> writes:
>
>> Logical/Physical Sector size: 512 bytes
>> Do I have some cut down version WD EARS series drive?
>
> Well, despite the Advanced Format sticker you have a drive formatted
> with 512-byte logical and physical blocks.
>

I did some extensive experiments and benchmarks, and the conclusion is
that the drive actually uses 4 kB sectors internally, although it
doesn't report that fact to the outside world.

First, I had to choose a benchmark that would expose the difference in
performance between aligned and unaligned partitions.

After trying out bonnie++ (not configurable enough and poorly
documented) and iozone (which only makes sense for testing either small
blocks on small datafiles or large blocks on large datafiles - it takes
infinity^2 to complete with small blocks on large datafiles), I decided
to go with postmark.

I started with a postmark.conf configuration for a long test:

    set location /mnt/sdb1
    set seed 12345678
    set read 1024
    set write 1024
    set buffering false
    set transactions 65536
    set size 512 2048
    set number 262144
    run
    quit

The number of initial files (262144) is deliberately large, so that when
the benchmark starts, it will cover ~1 GB of pre-created data files -
this way the 64 MB hardware cache of the drive shouldn't influence the
read op results too much.

Also, the RNG seed is fixed so that each benchmark run is analogous to
the other ones, only shifted relative to the drive's sector layout
(according to the partition alignment).

Each benchmark run was preceded by removing the old sdb1 partition,
creating a new one with the desired alignment, and creating a fresh ext4
filesystem on it.
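
As a side note on what "alignment" means for these runs: assuming
512-byte logical sectors and 4096-byte physical sectors (the latter is
an assumption about the drive, not something it reports), a partition
start is aligned when its byte offset is a multiple of 4096. Here is a
minimal sketch of that arithmetic in POSIX shell. The function names
is_aligned and aligned_starts are made up for illustration and are not
part of any tool mentioned here:

```shell
#!/bin/sh
# Assumed sizes: 512-byte logical sectors, 4096-byte physical sectors.
LOGICAL=512
PHYSICAL=4096

# is_aligned START_SECTOR
# Succeeds (exit status 0) when the byte offset of START_SECTOR
# is a multiple of the assumed physical sector size.
is_aligned() {
    [ $(( ($1 * $LOGICAL) % $PHYSICAL )) -eq 0 ]
}

# aligned_starts FIRST LAST
# Prints every aligned start sector in FIRST..LAST on one line.
aligned_starts() {
    s=$1
    out=""
    while [ "$s" -le "$2" ]; do
        if is_aligned "$s"; then
            out="$out $s"
        fi
        s=$((s + 1))
    done
    # Unquoted on purpose: drops the leading space.
    echo $out
}

# Candidate start sectors in the range 34..64
aligned_starts 34 64
```

Under those assumptions this prints "40 48 56 64", i.e. the start
sectors divisible by 8 (8 x 512 B = 4096 B). Sector 34, the GPT default
mentioned in this thread, is at byte offset 17408, which is not a
multiple of 4096.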

On a partition that started at 512 B sector 34 (the default with the GPT
table format), the benchmark took 6177 seconds total, with 13
transactions per second:

    Time:
        6177 seconds total
        4814 seconds of transactions (13 per second)

    Files:
        294832 created (47 per second)
            Creation alone: 262144 files (3404 per second)
            Mixed with transactions: 32688 files (6 per second)
        32803 read (6 per second)
        32721 appended (6 per second)
        294832 deleted (47 per second)
            Deletion alone: 261984 files (203 per second)
            Mixed with transactions: 32848 files (6 per second)

    Data:
        40.76 megabytes read (6.76 kilobytes per second)
        371.60 megabytes written (61.60 kilobytes per second)

With the partition starting at sector 40, the same test took only 655
seconds - almost 10 times less time, reaching 119 transactions per
second:

    Time:
        655 seconds total
        548 seconds of transactions (119 per second)

    Files:
        294832 created (450 per second)
            Creation alone: 262144 files (10922 per second)
            Mixed with transactions: 32688 files (59 per second)
        32803 read (59 per second)
        32721 appended (59 per second)
        294832 deleted (450 per second)
            Deletion alone: 261984 files (3156 per second)
            Mixed with transactions: 32848 files (59 per second)

    Data:
        40.76 megabytes read (63.72 kilobytes per second)
        371.60 megabytes written (580.94 kilobytes per second)

A similarly good result was obtained with the partition starting at
sector 64:

    Time:
        665 seconds total
        575 seconds of transactions (113 per second)

    Files:
        294832 created (443 per second)
            Creation alone: 262144 files (10922 per second)
            Mixed with transactions: 32688 files (56 per second)
        32803 read (57 per second)
        32721 appended (56 per second)
        294832 deleted (443 per second)
            Deletion alone: 261984 files (3969 per second)
            Mixed with transactions: 32848 files (57 per second)

    Data:
        40.76 megabytes read (62.76 kilobytes per second)
        371.60 megabytes written (572.21 kilobytes per second)

At this point, I decided to create a postmark configuration for a quick
test that takes, pessimistically,
at most ~10 minutes:

    set location /mnt/sdb1
    set seed 12345678
    set read 1024
    set write 1024
    set buffering false
    set transactions 4096
    set size 512 2048
    set number 262144
    run
    quit

Then I decided to automate the whole process of dropping and recreating
partitions and filesystems and running the benchmark, so I wrote a
script for that:

http://olo.org.pl/files/hw/postmark-automated/do_postmark_tests.sh

The script tries out all possible partition alignments from sector 34 to
64 and runs the benchmark on each of them.

The results are published here:

http://olo.org.pl/files/hw/postmark-automated/

These results clearly indicate that the drive has a sweet spot with
partition starts aligned to sectors divisible by 8: performance on
partitions starting at sectors 40, 48, 56 and 64 is roughly 5.5 times
better than on all others.

This is a bit puzzling - 5.5 x faster is more than one would expect from
a simple read-modify-write issue, isn't it? I've read about performance
differences of 2:1, not 5.5:1.

So for any other owners of WD EARS drives: if these don't report
physical 4096-byte sectors to you, don't believe them and align your
partitions at the aforementioned sectors (it is generally a good idea to
run the postmark benchmark to compare performance on aligned and
non-aligned partitions).

Just in case anyone doesn't know how to align these partitions (WARNING:
the instructions below will likely destroy any data that's on the given
drive, only do this with drives you're intending to erase):

    # parted /dev/YOUR_DEVICE_NAME
    (parted) mklabel gpt
    # Here ^ I've chosen the GPT partition table format, but others may
    # be OK too - untested by me.
    (parted) unit s
    # Here ^ we're choosing sectors as units of measurement
    (parted) mkpart primary ext2 40 -1
    # Here ^ we're creating a partition that starts at sector 40, which
    # is divisible by 8.
    # You can also try 48, 56, 64 and others - these should offer the
    # same high performance, but some space will go to waste - it's only
    # a few kilobytes, though.
    # Parted will likely complain about the end location of the
    # partition:
    Warning: You requested a partition from 40s to 2930277167s.
    The closest location we can manage is 40s to 2930277134s.
    Is this still acceptable to you?
    Yes/No?
    # Of course, we answer Yes.
    (parted) quit
    # After that, create a filesystem on the new partition as usual
    # (note the partition number at the end of the device name), e.g.:
    # mkfs.ext4 -T largefile4 /dev/YOUR_DEVICE_NAME1

This should get the optimum performance from your 4 kB physical sector
drives even when they report only 512 B sectors to the OS.

-- 
Best Regards,
Aleksander Adamowski
http://olo.org.pl