James Bottomley wrote:
On Fri, 2008-12-12 at 08:27 -0500, Ric Wheeler wrote:
Theodore Tso wrote:
On Thu, Dec 11, 2008 at 07:24:31PM -0500, Ric Wheeler wrote:
More than a year back, I was the sacrificial Linux person invited to
represent Linux at IDEMA. At that point, I seem to remember that Vista
supported native 4k drives only on data partitions (non-boot) and that
they required a 1MB alignment (no more odd 512 byte sector offsets).
I can talk to the folks to confirm, but my understanding is that they
are resigned to random unaligned 4k writes because Windows does this.
When I told them that we tried very hard to do write coalescing and
that filesystems could be made to align things on RAID stripe
boundaries, they seemed surprised (because Windows doesn't do this).
So as far as I know 4k alignment is all they need. And this is
something very simple we can do, either in distribution installers
forcibly sending a configuration parameter to the partition editors,
or changing the partition editors to have better defaults, or changing
the kernel to report different fantasy geometries if we can't find a
valid MBR partition label.
Also, they seem to be talking about 2011 for the 4k sector rollout,
which means Windows 7....
The disk manufacturers basically know that they will get tons
(literally!) of returned disks if they don't emulate 512 byte support -
boot loaders, old BIOSes, etc. will all generate these accesses.
It would be nice to get a mode bit that allowed you to test pure 4k
drives to help us ensure that we do the right thing despite this.
Actually, there is: READ_CAPACITY(16) contains fields showing
how many logical blocks per physical block there are. We could export
this to allow formatting tools to do the right thing. Note there was a
huge argument over this in the committee, so the alignment may not be
exact (some want odd alignment so that for dos labels they still get all
the partitions aligned on the physical boundary, which necessitates an
odd starting point), so we'd have to export the lowest aligned block
address as well.
The alignment mess the manufacturers created is all neatly documented in
SBC-3 section 4.5 (Physical Blocks). That also gives an example of
offset alignment.
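
For illustration, here is a minimal Python sketch of pulling those two
values out of a READ CAPACITY(16) response buffer. The byte offsets are
my reading of the SBC-3 parameter data layout (exponent in the low
nibble of byte 13, lowest aligned LBA in the low 14 bits of bytes
14-15) and should be checked against the spec. The function name and
the dictionary keys are made up for this example.

```python
import struct


def parse_read_capacity_16(data: bytes) -> dict:
    """Decode the block-size fields of a READ CAPACITY(16) response.

    Assumed layout (per SBC-3, verify against the standard):
      bytes 0-7   last logical block address, big-endian
      bytes 8-11  logical block length in bytes, big-endian
      byte  13    low 4 bits: logical blocks per physical block exponent
      bytes 14-15 low 14 bits: lowest aligned logical block address
    """
    if len(data) < 16:
        raise ValueError("need at least 16 bytes of parameter data")

    last_lba, block_len = struct.unpack_from(">QI", data, 0)
    exponent = data[13] & 0x0F
    lowest_aligned = ((data[14] & 0x3F) << 8) | data[15]

    return {
        "logical_blocks": last_lba + 1,
        "logical_block_size": block_len,
        "logical_per_physical": 1 << exponent,
        "physical_block_size": block_len << exponent,
        "lowest_aligned_lba": lowest_aligned,
    }
```

A drive reporting 512 byte logical blocks with an exponent of 3 would
decode as 8 logical blocks per physical block, i.e. a 4096 byte
physical block.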
What I would like to test with is the drives that don't do the emulation
(below, your case (3)). On the other hand, if all drives emulate the 512
byte requests without issue, we can probably ignore the boot sequence
and simply focus on getting alignment right today.
In effect, getting our partitions aligned on a 4k (or larger) boundary
is probably good and reasonable even without these drives.
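
As a rough sketch of the arithmetic a partition editor would need, the
Python below checks whether a starting LBA sits on a physical block
boundary and rounds it up if not. The function names are hypothetical.
The optional lowest_aligned_lba argument is an assumption covering the
odd-alignment case James mentions; with the default of 0 it is plain
round-up to a multiple.

```python
def is_aligned(lba: int, logical_per_physical: int,
               lowest_aligned_lba: int = 0) -> bool:
    """True if lba falls on the start of a physical block."""
    return (lba - lowest_aligned_lba) % logical_per_physical == 0


def align_start(lba: int, logical_per_physical: int,
                lowest_aligned_lba: int = 0) -> int:
    """Return the smallest aligned LBA that is >= lba."""
    offset = lowest_aligned_lba % logical_per_physical
    # Python's % always returns a non-negative result here, so this is
    # the distance forward to the next aligned block (0 if already there).
    return lba + (offset - lba) % logical_per_physical
```

With 512 byte logical blocks on 4k physical blocks (8 per physical
block), the traditional DOS start at sector 63 is misaligned and
align_start(63, 8) moves it to 64. Passing 2048 instead of 8 gives the
1MB alignment mentioned for Vista.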
The trick is to actually get your hands on these parts; I think that
they are starting to trickle out.
Right. Basically we know, because we've emulated it with scsi_debug,
that Linux just works (tm) with 4k sector disks. What we don't know
(because you can't boot with emulators) is whether the boot sequence
will work.
There are 3 cases:
     1. standard 512 byte physical blocks: just do as today
     2. 4k physical emulating 512 logical: Try to get the partition
        alignments correct using the exported parameters, but otherwise
        treat as 1.
     3. 4k logical blocks. We *think* this all works provided the bios
        can boot them, but we haven't had any samples to test.
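
The three cases can be told apart from the logical and physical block
sizes alone. A small Python sketch, assuming both sizes have already
been obtained somehow (for example from the READ_CAPACITY(16) fields);
classify_drive and its return values are invented for this example and
simply mirror the numbering above.

```python
def classify_drive(logical_size: int, physical_size: int) -> int:
    """Map block sizes in bytes onto cases 1-3 from the list above."""
    if physical_size < logical_size or physical_size % logical_size:
        raise ValueError("physical size must be a multiple of logical size")

    if logical_size > 512:
        return 3  # large logical blocks: boot path untested
    if physical_size > logical_size:
        return 2  # 512 byte emulation: align partitions, else as case 1
    return 1      # plain 512 byte drive: do as today
```

So classify_drive(512, 4096) gives case 2, the one where only partition
alignment needs attention.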
James
I have no worries about the emulated 512 byte case - that is basically
a firmware issue internal to the drive (and note that arrays have done
this without issue in Linux for years; the EMC Symm has a 64K internal
"sector" size, for example).
ric