Hi Grant,

Grant Grundler <grundler@xxxxxxxxxxxx> writes:

> Ping? Does no one care how long BLK_SECDISCARD takes?
>
> ChromeOS has landed this change as a compromise between "fast" (<10
> seconds) and "minimize risk" (~90 seconds) for a 23GB partition on
> eMMC:
> https://chromium-review.googlesource.com/#/c/302413/

Including the patch would be helpful.  I believe this is it.  My
comments are inline.

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 8411be3..43943c7 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -60,21 +60,37 @@
 	granularity = max(q->limits.discard_granularity >> 9, 1U);
 	alignment = (bdev_discard_alignment(bdev) >> 9) % granularity;
 
-	/*
-	 * Ensure that max_discard_sectors is of the proper
-	 * granularity, so that requests stay aligned after a split.
-	 */
-	max_discard_sectors = min(q->limits.max_discard_sectors, UINT_MAX >> 9);
-	max_discard_sectors -= max_discard_sectors % granularity;
-	if (unlikely(!max_discard_sectors)) {
-		/* Avoid infinite loop below. Being cautious never hurts. */
-		return -EOPNOTSUPP;
-	}
+	max_discard_sectors = min(q->limits.max_discard_sectors,
+				  UINT_MAX >> 9);

Unnecessary reformatting.

 	if (flags & BLKDEV_DISCARD_SECURE) {
 		if (!blk_queue_secdiscard(q))
 			return -EOPNOTSUPP;
 		type |= REQ_SECURE;
+		/*
+		 * Secure erase performs better by telling the device
+		 * about the largest range possible.  Secure erase
+		 * piecemeal will likely result in mapped sectors
+		 * getting evacuated from one range and parked in
+		 * another range that will get erased by a future
+		 * erase command.  This does NOT happen for normal
+		 * TRIM or DISCARD operations.
+		 *
+		 * 32MB was a compromise to avoid blocking the device
+		 * for potentially minute(s) at a time.
+		 */
+		if (max_discard_sectors < (1 << (25-9)))	/* 32MiB */
+			max_discard_sectors = 1 << (25-9);

And here you're ignoring q->limits.max_discard_sectors.  I'm surprised
this worked!
+	}
+
+	/*
+	 * Ensure that max_discard_sectors is of the proper
+	 * granularity, so that requests stay aligned after a split.
+	 */
+	max_discard_sectors -= max_discard_sectors % granularity;
+	if (unlikely(!max_discard_sectors)) {
+		/* Avoid infinite loop below. Being cautious never hurts. */
+		return -EOPNOTSUPP;
 	}
 
 	atomic_set(&bb.done, 1);

Grant, can we start over with the problem description?  (Sorry, I didn't
see the previous posts.)  I'd like to know the values of
discard_granularity and discard_max_bytes for your device.
Additionally, it would be interesting to know how the discards are being
initiated.  Is it via a userspace utility such as mkfs, online discard
via some file system mounted with -o discard, or something else?
Finally, can you post binary blktrace data somewhere for the slow case?

Thanks!
Jeff

> On Mon, Sep 28, 2015 at 2:45 PM, Grant Grundler <grundler@xxxxxxxxxxxx> wrote:
>> [resending...I forgot to switch gmail back to text-only mode. grrrh..]
>>
>> ---------- Forwarded message ----------
>> From: Grant Grundler <grundler@xxxxxxxxxxxx>
>> Date: Mon, Sep 28, 2015 at 2:42 PM
>> Subject: Re: RFC: 32-bit __data_len and REQ_DISCARD+REQ_SECURE
>> To: Grant Grundler <grundler@xxxxxxxxxxxx>
>> Cc: Jens Axboe <axboe@xxxxxxxxx>, Ulf Hansson
>> <ulf.hansson@xxxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>,
>> "linux-mmc@xxxxxxxxxxxxxxx" <linux-mmc@xxxxxxxxxxxxxxx>
>>
>>
>> On Thu, Sep 24, 2015 at 10:39 AM, Grant Grundler <grundler@xxxxxxxxxxxx> wrote:
>>>
>>> Some followup.
>> ...
>>>
>>> 2) I've been able to test this hack on an eMMC device:
>>> [   13.147747] mmc..._secdiscard_rq(mmc1) ERASE from 14116864 cnt
>>> 0x2c00000 (size 22528 MiB)
>>> [   13.155964] sdhci cmd: 35/0x1a arg 0xd76800
>>> [   13.160266] sdhci cmd: 36/0x1a arg 0x39767ff
>>> [   13.164593] sdhci cmd: 38/0x1b arg 0x80000000
>>> [   13.803360] random: nonblocking pool is initialized
>>> [   14.567735] sdhci cmd: 13/0x1a arg 0x10000
>>> [   14.573324] mmc..._secdiscard_rq(mmc1) err 0
>>>
>>> This was with ~15K files and about 5GB written to the device.  1.4
>>> seconds compared to about 20 minutes to secure erase the same region
>>> with the original v3.18 code.
>>
>> To put a few more numbers on "chunk size vs perf":
>>    1EG    (512KB) -> 44K commands  -> ~20 minutes
>>    32EG   (16MB)  -> 1375 commands -> ~1 minute
>>    128EG  (64MB)  -> 344 commands  -> ~30 seconds
>>    8191EG (~4GB)  -> 6 commands    -> 2 seconds + ~8 seconds mkfs
>> (I'm assuming the times above include about 6-10 seconds of mkfs as
>> part of writing a new file system.)
>>
>> This is with only ~300MB of data written to the partition.  I'm fully
>> aware that times will vary depending on how much data needs to be
>> migrated (in this case very little or none).  I'm certain the gap
>> will only get worse the smaller the "chunk size" used to Secure
>> Erase, due to repeated data migration.
>>
>> Given the different use model for secure erase (legal/contractually
>> required behavior), is using a 4GB chunk size acceptable?
>>
>> Would anyone be terribly offended if I used the recently added
>> "MMC_IOC_MULTI_CMD" to send the cmd 35/36/38 sequence to the eMMC
>> device to securely erase the offending partition?
>>
>> thanks,
>> grant
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/