Hi Theodore,
Thank you for including me on this important topic.
I have recently subscribed to the util-linux-ng mailing list so I should
receive future email on such topics.
From a GParted perspective, there is a section of code that performs a
"round to cylinders" function. This function tries to align partitions
on the traditional CHS boundaries. It would be great if it were
possible to identify the devices that benefit from these 4096 byte
boundaries, as opposed to changing it for all devices.
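
To illustrate what "round to cylinders" means in practice, here is a minimal
sketch in Python. It is not GParted's or libparted's actual code; the function
names, the 512-byte logical sector size and the 255/63 geometry are assumptions
chosen only for illustration.

```python
SECTOR_SIZE = 512        # bytes per logical sector (assumed)
HEADS = 255              # traditional fantasy geometry (assumed)
SECTORS_PER_TRACK = 63


def round_to_cylinder(sector, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Round an LBA sector number to the nearest cylinder boundary."""
    cyl_size = heads * spt  # sectors per cylinder
    return ((sector + cyl_size // 2) // cyl_size) * cyl_size


def is_4k_aligned(sector, sector_size=SECTOR_SIZE):
    """True if the byte offset of this sector is a multiple of 4096."""
    return (sector * sector_size) % 4096 == 0


if __name__ == "__main__":
    start = round_to_cylinder(2000000)
    print(start, is_4k_aligned(start))
```

With a 255/63 geometry a cylinder is 16065 sectors, which is not a multiple
of 8, so a cylinder boundary is 4096-byte aligned only on every eighth
cylinder.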
Since GParted relies heavily on the libparted library for almost all of
its partition table operations, it is important to incorporate these
types of updates into the Parted project.
Regards,
Curtis Gedak
Theodore Ts'o wrote:
I attended the IDEMA (International Disk Drive Equipment and Materials
Association) conference today to give a talk about Linux, and during one
of the breaks I got buttonholed by someone who asked me if I could help
make sure Linux would be able to deal with the upcoming HDD sector size
move from 512 to 4096. Just coincidentally, I ran across the following
article from Slashdot, "Which Operating System Is Best For solid-state
disks":
http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=Storage&articleId=9123140&taxonomyId=19&pageNumber=1
Quoting from that article, Justin Sykes from Micron Technologies stated:
"NAND [flash memory] fundamentally has native 4K block
sizes. Anything that's not aligned to a 4K block creates extra
challenges," Sykes said. "There ends up being background
operations to garbage-collect that empty space [in larger file
blocks] that isn't fully utilized. And, so that activity is
chewing up your bandwidth in the background, and it adds extra
wear to the NAND [flash memory]."
I fully expect that perhaps someone from SanDisk or Intel will pop up
and say that "this is just because Micron's SSD's suck; *our* SSD's won't have
this problem". Perhaps; but HDD's won't be going away any time soon[1],
and they will be moving to a 4k block size in the next few years.
So what's the problem? The main problem seems to be that by default,
we are using partition tables that cause the partitions to be not
aligned on 4k boundaries, because of the default hdd geometry used by our
partition tools and returned by the HDIO_GETGEO ioctl:
Disk /dev/sda: 255 heads, 63 sectors, 38913 cylinders

Nr AF  Hd Sec  Cyl  Hd Sec  Cyl     Start      Size ID
 1 80   1   1    0 254  63  121        63   1959867 83
 2 00   0   1  122 254  63  619   1959930   8000370 82
 3 00   0   1  620 254  63 1023   9960300 615177045 05
 4 00   0   0    0   0   0    0         0         0 00
 5 00   1   1  620 254  63 1023        63 615176982 8e
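
As a quick check of the table above, the following Python sketch computes how
far each partition start is from a 4096-byte boundary. It assumes 512-byte
logical sectors, and it assumes that the start of logical partition 5 is
relative to the start of extended partition 3; neither assumption is stated
in the listing itself.

```python
SECTOR_SIZE = 512   # bytes per logical sector (assumed)
ALIGNMENT = 4096    # desired alignment in bytes

# Start sectors copied from the partition table listing.
EXTENDED_START = 9960300
starts = {
    1: 63,
    2: 1959930,
    3: EXTENDED_START,
    5: EXTENDED_START + 63,  # assumed relative to the extended partition
}


def misalignment(start_sector):
    """Bytes past the previous 4096-byte boundary (0 means aligned)."""
    return (start_sector * SECTOR_SIZE) % ALIGNMENT


for nr, start in starts.items():
    off = misalignment(start)
    state = "aligned" if off == 0 else "misaligned by %d bytes" % off
    print("partition %d: start sector %d is %s" % (nr, start, state))
```

Under these assumptions none of the four partitions starts on a 4096-byte
boundary.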
For pretty much all modern systems, and certainly any drive using the
SATA interface, the boot loader no longer needs to use the original CHS
INT13 interface, so what we pick for the CHS geometry doesn't matter as
far as bootloaders are concerned. Linux only uses LBA's, so the bottom
line is that, aside from controlling the alignment of partitions, the
CHS values don't really matter.
For SSD's and HDD's that use a 4k internal sector size, being 4k aligned
makes a big difference because it avoids read-modify-write cycles. We
can achieve this easily if we simply use a CHS geometry of 56
sectors/track instead of 63 sectors. So, I would propose that we change
the default geometry used by the partitioning tools in util-linux-ng,
gparted, etc. so that the default is 56 sectors/track; furthermore, to catch those
partitioning tools that use the HDIO_GETGEO ioctl, that we change the
fantasy geometry generated in drivers/scsi/scsicam.c:scsicam_bios_param()
and drivers/ata/libata-scsi.c to also use a 255/56 head/sector geometry.
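
The arithmetic behind the 56 sectors/track proposal can be sketched in Python
as follows. The helper name is hypothetical, and the sketch assumes 512-byte
logical sectors and partitions that start on track or cylinder boundaries.

```python
def geometry_alignment(heads, spt, sector_size=512, alignment=4096):
    """Return (track_aligned, cylinder_aligned) for a CHS geometry.

    A boundary is aligned when its size in sectors is a multiple of the
    number of logical sectors per 4096-byte block (8 for 512-byte sectors).
    """
    sectors_per_block = alignment // sector_size
    track_aligned = spt % sectors_per_block == 0
    cylinder_aligned = (heads * spt) % sectors_per_block == 0
    return track_aligned, cylinder_aligned


for heads, spt in [(255, 63), (255, 56)]:
    print(heads, spt, geometry_alignment(heads, spt))
```

With 63 sectors/track neither tracks (63 sectors) nor cylinders (16065
sectors) are multiples of 8 sectors. With 56 sectors/track both are (56 and
14280 sectors), so a first partition starting at sector 56 and later
partitions starting on cylinder boundaries all land on 4096-byte boundaries.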
Does this make sense? Am I missing some fatal flaw? Should I send
patches?
- Ted
[1] There was an absolutely brilliant presentation at the IDEMA
conference from Steve Hetzler, an IBM Fellow from Almaden Research Lab,
that used an economic argument based on the capital cost of the Fab's and
what would happen if one were to move *all* of the world's Silicon Fabs
to generating flash for SSD's --- this would only satisfy 18% of the HDD
market --- and the total size of the HDD market by revenue is $35
billion, and the value of the output of the Si Fab's today is $280
billion --- so are we going to give up $280 billion dollars worth of
revenue from the current products of today's available Fabs in order to
displace 18% of the HDD $35 billion market?
What about building new Fabs? Well, building new fabs sufficient to
create enough flash to replace all of the HDD market would cost
approximately one trillion dollars. A single 45nm fab is $3-4
billion; and a 22nm Fab will probably cost $7-8 billion. (This is
just the cost to *build* the Fab; it ignores the materials and operating
costs, which would be on top of this.) Intel brings on line maybe a fab or two
a year, and Moore's law doesn't help that much, because each
shrink quadruples the amount of Flash that can be created on each wafer,
but it also doubles the cost of the Fab; and the size of the HDD market is
still increasing at 40% a year. Anyway, I'm not doing Dr. Hetzler's
talk justice, but bottom line, Aryan's claims that SSD's will completely
displace HDD's within five years may very well be a
little.... over-optimistic.
In other words, the flash production may be doubling every year, but
that was starting from a relatively small base compared to the HDD
market --- and to catch up and overtake the HDD market, it needs to do
far more than that --- and the model of using older fabs that had been
used for the previous generation of CPU's isn't going to be enough to
meet the demand, so *if* SSD's were to become as popular as some of the
SSD cheerleaders have stated, the current NAND oversupply could very
easily become an undersupply.