Re: [PATCH] e2fsck: Discard free data and inode blocks.

 On 10/22/2010 11:37 AM, Eric Sandeen wrote:
Ric Wheeler wrote:

...

Well, so far the only breakage I have seen was with lots of small TRIMs
(or UNMAPs, etc.) issued in a random pattern, never in the case of mkfs,
which is quite the opposite: big sequential ranges.

Hangs should be covered by those two patches:

http://marc.info/?l=linux-ext4&m=128774558623608&w=2
http://marc.info/?l=linux-ext4&m=128767099123375&w=2

if, of course, they get upstream. There is also a big win when discard
zeroes data, because in that case we can just skip inode table
initialization (zeroing) without needing the in-kernel lazyinit code
enabled. And we get all this for free. It was introduced with Sandeen's
patch:

http://marc.info/?l=linux-ext4&m=128234048208327&w=2

So, I would rather leave it on by default.

-Lukas
You cannot 100% depend on discard zeroing blocks - that is not a
universal requirement of devices that support it. Specifically, for ATA
devices, I think that there are optional bits that specify how a device
will behave when you read from a trimmed region.
But don't we have the ability to test whether discard -does- zero blocks,
as advertised by the device?  And honestly if the device mis-reports, that
sounds like a device vendor problem to fix.

The proposal wasn't to discard and assume zero, but to check for that
behavior:

http://kerneltrap.org/mailarchive/linux-ext4/2010/9/21/6885628/thread

+		if (!retval && mke2fs_discard_zeroes_data(fs)) {
+			if (verbose)
+				printf(_("Discard succeeded and will return 0s "
+					 " - enabling lazy_itable_init\n"));
+			lazy_itable_init = 1;
+			lazy_itable_zeroed = 1;
+		}

so we're not depending on it zeroing blocks, we're just depending on it
advertising correctly whether or not it -does- zero.

-Eric
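
The check described above can be sketched roughly as follows. This is only
an illustration, not the code from the patch: it assumes a Linux system
where the BLKDISCARDZEROES ioctl is available, and the helper names
device_advertises_zeroing() and can_skip_itable_zeroing() are made up for
this example. The body of mke2fs_discard_zeroes_data() is not shown in this
thread, so nothing here should be read as its actual implementation.

```c
#include <sys/ioctl.h>
#include <linux/fs.h>	/* BLKDISCARDZEROES on Linux */

/* Fallback in case the installed headers do not define the ioctl. */
#ifndef BLKDISCARDZEROES
#define BLKDISCARDZEROES _IO(0x12, 124)
#endif

/*
 * Hypothetical helper: ask the block layer whether the device behind fd
 * advertises that discarded ranges read back as zeroes.
 * Any ioctl failure (bad fd, not a block device, ...) is treated as "no".
 */
int device_advertises_zeroing(int fd)
{
	unsigned int zeroes = 0;

	if (ioctl(fd, BLKDISCARDZEROES, &zeroes) < 0)
		return 0;
	return zeroes != 0;
}

/*
 * Hypothetical helper mirroring the condition in the quoted hunk:
 * inode table zeroing may be skipped only if the discard itself
 * succeeded (retval == 0) AND the device advertises zeroing.
 */
int can_skip_itable_zeroing(int discard_retval, int advertises_zeroing)
{
	return discard_retval == 0 && advertises_zeroing;
}
```

A caller would open the block device, issue the discard, and then pass its
return value together with device_advertises_zeroing(fd) to
can_skip_itable_zeroing(). Treating every error as "does not zero" keeps
the fallback on the safe side: the inode tables simply get zeroed as usual.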



I think that ATA devices have historically not done this correctly, but
the T13 committee is working on it. The question is whether the bit we
check and rely on has the right semantics (and then whether devices will
reliably implement it).

Historically, array vendors did rely on SCSI commands like the
old-fashioned "WRITE_SAME" to initialize storage for them, but that takes
a *long* time to run :)

Ric

