Re: Howto for properly partitioning new drives with 4096 byte sectors (like Western Digital Advanced Format EARS drives)


 



On Sun, Jan 24, 2010 at 12:14 AM, Martin K. Petersen
<martin.petersen@xxxxxxxxxx> wrote:
>>>>>> "Aleksander" == Aleksander Adamowski <linux@xxxxxxxxxx> writes:
>>         Logical/Physical Sector size:           512 bytes
>> Do I have some cut down version  WD EARS series drive?
>
> Well, despite the Advanced Format sticker you have a drive formatted
> with 512-byte logical and physical blocks.
>

I did some extensive experiments and benchmarks, and the conclusion is
that the drive actually uses 4 kB sectors internally, although it
doesn't report that fact to the outside world.

First, I had to choose a benchmark that would expose the difference in
performance between aligned and unaligned partitions.

After trying out bonnie++ (not configurable enough and poorly
documented) and iozone (which makes sense only for testing either
small blocks on small datafiles or large blocks on large datafiles -
it takes infinity^2 to complete with small blocks on large datafiles),
I decided to go with postmark.

I started with a postmark.conf configuration for a long test:

set location /mnt/sdb1
set seed 12345678
set read 1024
set write 1024
set buffering false
set transactions 65536
set size 512 2048
set number 262144
run
quit

The number of initial files (262144) is deliberately large, so that
when the benchmark starts, it will cover ~1 GB of pre-created data
files - this way the 64 MB hardware cache of the drive shouldn't
influence the read op results too much.
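
The ~1 GB figure can be checked with a little arithmetic. This is only a
sketch: it assumes file sizes are spread evenly between the two "set size"
bounds and that ext4 uses 4 KiB blocks (neither is stated above), so each
small file occupies one full filesystem block on disk.

```python
# Back-of-the-envelope footprint of postmark's initial file pool.
# ASSUMPTIONS (not from the post): file sizes are spread evenly between the
# two bounds, and ext4 uses 4 KiB blocks, so every small file takes one block.
FILE_COUNT = 262144              # "set number" in postmark.conf
MIN_SIZE, MAX_SIZE = 512, 2048   # "set size" in postmark.conf, in bytes
FS_BLOCK = 4096                  # assumed filesystem block size
DRIVE_CACHE = 64 * 1024 ** 2     # 64 MB drive cache

def blocks_needed(size, block=FS_BLOCK):
    """Whole filesystem blocks needed to hold `size` bytes."""
    return -(-size // block)     # ceiling division

payload = FILE_COUNT * (MIN_SIZE + MAX_SIZE) // 2            # average file data
on_disk = FILE_COUNT * blocks_needed(MAX_SIZE) * FS_BLOCK    # one block per file

print(f"payload: ~{payload / 1024 ** 2:.0f} MiB")
print(f"on disk: ~{on_disk / 1024 ** 3:.0f} GiB")
print(f"on-disk footprint / drive cache: {on_disk // DRIVE_CACHE}x")
```

Under these assumptions the file data itself is only a few hundred MiB, but
the blocks it occupies add up to 1 GiB, many times the size of the cache.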

Also, the RNG seed is fixed so that every benchmark run performs the
same sequence of operations; the only thing that changes between runs
is how the data is shifted relative to the drive's sector layout (per
partition alignment).

Before each benchmark run, I removed the old sdb1 partition, created a
new one with the desired alignment and created a fresh ext4 filesystem
on it.

On a partition that started at 512B sector 34 (the default with GPT
table format), the benchmark took 6177 seconds total, with 13
transactions per second:

Time:
        6177 seconds total
        4814 seconds of transactions (13 per second)

Files:
        294832 created (47 per second)
                Creation alone: 262144 files (3404 per second)
                Mixed with transactions: 32688 files (6 per second)
        32803 read (6 per second)
        32721 appended (6 per second)
        294832 deleted (47 per second)
                Deletion alone: 261984 files (203 per second)
                Mixed with transactions: 32848 files (6 per second)

Data:
        40.76 megabytes read (6.76 kilobytes per second)
        371.60 megabytes written (61.60 kilobytes per second)


With the partition starting at sector 40, the same test took only 655
seconds - almost 10 times less time, reaching 119 transactions per
second:

Time:
        655 seconds total
        548 seconds of transactions (119 per second)

Files:
        294832 created (450 per second)
                Creation alone: 262144 files (10922 per second)
                Mixed with transactions: 32688 files (59 per second)
        32803 read (59 per second)
        32721 appended (59 per second)
        294832 deleted (450 per second)
                Deletion alone: 261984 files (3156 per second)
                Mixed with transactions: 32848 files (59 per second)

Data:
        40.76 megabytes read (63.72 kilobytes per second)
        371.60 megabytes written (580.94 kilobytes per second)


A similarly good result was obtained with the partition starting at sector 64:

Time:
        665 seconds total
        575 seconds of transactions (113 per second)

Files:
        294832 created (443 per second)
                Creation alone: 262144 files (10922 per second)
                Mixed with transactions: 32688 files (56 per second)
        32803 read (57 per second)
        32721 appended (56 per second)
        294832 deleted (443 per second)
                Deletion alone: 261984 files (3969 per second)
                Mixed with transactions: 32848 files (57 per second)

Data:
        40.76 megabytes read (62.76 kilobytes per second)
        371.60 megabytes written (572.21 kilobytes per second)
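
To put the three long runs side by side, here is a small sketch that
computes the speedups relative to the sector-34 run. The numbers are copied
from the postmark output above; the dictionary layout and function names are
made up for illustration.

```python
# Totals copied from the three postmark runs above, keyed by start sector.
results = {
    34: {"total_s": 6177, "tps": 13},
    40: {"total_s": 655, "tps": 119},
    64: {"total_s": 665, "tps": 113},
}
baseline = results[34]  # the unaligned GPT default

def time_speedup(start):
    """How many times shorter the total run time is than at sector 34."""
    return baseline["total_s"] / results[start]["total_s"]

def tps_speedup(start):
    """How many times higher the transaction rate is than at sector 34."""
    return results[start]["tps"] / baseline["tps"]

for start in (40, 64):
    print(f"sector {start}: {time_speedup(start):.1f}x less time, "
          f"{tps_speedup(start):.1f}x more transactions per second")
```

Both aligned runs come out at roughly 9x the unaligned one on either metric.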


At this point, I decided to cut the postmark configuration down to a
quick test that takes at most ~10 minutes even in the worst case (only
the number of transactions changes, from 65536 to 4096):

set location /mnt/sdb1
set seed 12345678
set read 1024
set write 1024
set buffering false
set transactions 4096
set size 512 2048
set number 262144
run
quit


I then automated the whole process of dropping and recreating
partitions and filesystems and running the benchmark with a script:

http://olo.org.pl/files/hw/postmark-automated/do_postmark_tests.sh

The script tries out all possible partition alignments from sector 34
to 64 and runs benchmarks on them.

The results are published here:
http://olo.org.pl/files/hw/postmark-automated/

These results clearly indicate that the drive has a sweet spot for
partitions whose start sector is divisible by 8 (8 x 512 B = 4096 B):
performance on partitions starting at sectors 40, 48, 56 and 64 is
roughly 5.5 times better than on all others.
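
The "divisible by 8" rule follows directly from the sector sizes. A minimal
sketch, assuming 512-byte logical sectors and 4096-byte physical sectors
(the latter is what the benchmarks suggest, not what the drive reports):

```python
LOGICAL = 512     # sector size the drive reports
PHYSICAL = 4096   # ASSUMED internal sector size, inferred from the benchmarks

def is_aligned(start_sector, logical=LOGICAL, physical=PHYSICAL):
    """True if the partition's first byte falls on a physical sector boundary."""
    return (start_sector * logical) % physical == 0

# The same range of start sectors that the benchmark script walks through.
aligned = [s for s in range(34, 65) if is_aligned(s)]
print(aligned)  # [40, 48, 56, 64]
```

These are exactly the four start sectors that performed well.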

This is a bit puzzling - 5.5 x faster is more than one would expect
from a simple read-modify-write issue, isn't it? I've read about
performance differences of 2:1, not 5.5:1.
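
For reference, this is what the read-modify-write issue looks like in
numbers. The sketch assumes the filesystem issues 4 KiB block writes at
4 KiB offsets within the partition and that the drive has 4096-byte
physical sectors; it illustrates the mechanism only and does not explain
the 5.5:1 ratio.

```python
LOGICAL = 512     # reported sector size
PHYSICAL = 4096   # ASSUMED internal sector size

def physical_sectors_touched(part_start, fs_offset=0, length=4096):
    """Number of physical sectors covered by a write of `length` bytes that
    begins `fs_offset` bytes into a partition starting at `part_start`."""
    first_byte = part_start * LOGICAL + fs_offset
    last_byte = first_byte + length - 1
    return last_byte // PHYSICAL - first_byte // PHYSICAL + 1

def needs_read_modify_write(part_start, fs_offset=0, length=4096):
    """True if the write covers any physical sector only partially, so the
    drive has to read the old contents before it can write."""
    first_byte = part_start * LOGICAL + fs_offset
    return first_byte % PHYSICAL != 0 or (first_byte + length) % PHYSICAL != 0

for start in (34, 40):
    print(f"start sector {start}: "
          f"{physical_sectors_touched(start)} physical sector(s) per 4 KiB write, "
          f"read-modify-write: {needs_read_modify_write(start)}")
```

With a start at sector 34, every 4 KiB filesystem block straddles two
physical sectors and both are only partially overwritten; with a start at
sector 40, each block maps onto exactly one physical sector.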


So, for any other owners of WD EARS drives: if your drive doesn't
report physical 4096-byte sectors, don't believe it, and align your
partitions at the aforementioned sectors (it's generally a good idea to
run the postmark benchmark to compare performance on aligned and
non-aligned partitions).

Just in case anyone doesn't know how to align these partitions
(WARNING: the instructions below will likely destroy any data that's
on the given drive, only do this with drives you're intending to
erase):

# parted /dev/YOUR_DEVICE_NAME

(parted) mklabel gpt
# Here ^ I've chosen the GPT partition table format, but others may be
# OK too - untested by me.

(parted) unit s
# Here ^ we're choosing sectors as units of measurement

(parted) mkpart primary ext2 40 -1
# Here ^ we're creating a partition that starts at sector 40, which is
# divisible by 8.
# You can also try 48, 56, 64 and others - these should offer the same
# high performance, but some space will go to waste - it's only a few
# kilobytes, though.

# Parted will likely complain about the end location of the partition:
Warning: You requested a partition from 40s to 2930277167s.
The closest location we can manage is 40s to 2930277134s.
Is this still acceptable to you?
Yes/No?
# Of course, we answer Yes.

(parted) quit

# After that, create a filesystem on the new partition (not on the
# whole device) as usual, e.g.:
# mkfs.ext4 -T largefile4 /dev/YOUR_DEVICE_NAME1

This should get the optimum performance from your 4 kB physical sector
drives even when they report only 512 B sectors to the OS.
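
To double-check the result afterwards, the partition's start sector can be
read back from sysfs, where Linux exposes it in 512-byte units as
/sys/block/DISK/PARTITION/start. A minimal sketch; the device names sdb and
sdb1 are just the ones used in the benchmarks above, and the function names
are made up for illustration:

```python
from pathlib import Path

def partition_start(disk, partition, sysfs_root="/sys/block"):
    """Start of the partition in 512-byte sectors, as exposed by sysfs."""
    return int(Path(sysfs_root, disk, partition, "start").read_text().strip())

def is_4k_aligned(start_sector):
    """True if the start sector falls on a 4096-byte boundary (8 x 512 B)."""
    return start_sector % 8 == 0

if __name__ == "__main__":
    try:
        start = partition_start("sdb", "sdb1")  # adjust to your device
    except OSError as exc:
        print(f"could not read the partition start: {exc}")
    else:
        state = "aligned" if is_4k_aligned(start) else "NOT aligned"
        print(f"sdb1 starts at sector {start}: {state} to 4096 bytes")
```

This only reads from sysfs, so it is safe to run on a drive that holds data.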


-- 
Best Regards,
  Aleksander Adamowski
  http://olo.org.pl
