On Tue, 8 Dec 2020 at 12:26, Michael Walle <michael@xxxxxxxx> wrote: > > Hi Ulf, Hi Ted, > > Am 2020-12-08 10:49, schrieb Ulf Hansson: > > On Tue, 8 Dec 2020 at 03:41, Theodore Y. Ts'o <tytso@xxxxxxx> wrote: > >> On Mon, Dec 07, 2020 at 09:39:32PM +0100, Michael Walle wrote: > >> > > There are three different MMC commands which are defined: > >> > > > >> > > 1) DISCARD > >> > > 2) ERASE > >> > > 3) SECURE ERASE > >> > > > >> > > The first two are expected to be fast, since it only involves clearing > >> > > some metadata fields in the Flash Translation Layer (FTL), so that the > >> > > LBA's in the specified range are no longer mapped to a flash page. > >> > > >> > Mh, where is it specified that the erase command is fast? According > >> > to the Physical Layer Simplified Specification Version 8.00: > >> > > >> > The actual erase time may be quite long, and the host may issue CMD7 > >> > to deselect thhe card or perform card disconnection, as described in > >> > the Block Write section, above. > > > > Before I go into some more detail, of course I fully agree that > > dealing with erase/discard from the eMMC/SD specifications (and other > > types of devices) point of view isn't entirely easy. :-) > > > > But I also think we can do better than currently, at least for eMMC/SD. > > > >> > >> I looked at the eMMC specification from JEDEC (JESD84-A44) and there, > >> both the "erase" and "trim" are specified that the work is to be > >> queued to be done at a time which is convenient to the controller > >> (read: FTL). This is in contrast to the "secure erase" and "secure > >> trim" commands, where the erasing has to be done NOW NOW NOW for "high > >> security applications". > > Oh this might also be because I've cited from the wrong place, namely > the > mmc_init_card() function. But what I really meant was the sd card > equivalent > which should be mmc_read_ssr(). Sorry. > > discard_support = UNSTUFF_BITS(resp, 313 - 288, 1); > card->erase_arg = (card->scr.sda_specx && discard_support) ? > SD_DISCARD_ARG : SD_ERASE_ARG; I assumed you were referring to this, but good that you pointed this out, for clarity. > > >> The only difference between "erase" and "trim" seems to be that erahse > >> has to be done in units of the "erase groups" which is typically > >> larger than the "write pages" which is the granularity required by the > >> trim command. There is also a comment that when you are erasing the > >> entire partition, "erase" is preferred over "trim". (Presumably > >> because it is more convenient? The spec is not clear.) > >> > >> Unfortunately, the SD Card spec and the eMMC spec both read like they > >> were written by a standards committee stacked by hardware engineers. > >> It doesn't look like they had file system engineers in the room, > >> because the distinctions between "erase" and "trim" are pretty silly, > >> and not well defined. Aside from what I wrote, the spec is remarkably > >> silent about what the host OS can depend upon. > > > > Moreover, the specs have evolved over the years. Somehow, we need to > > map a REQ_OP_DISCARD and to the best matching > > operation that the currently inserted eMMC/SD card supports... > > Do we really need to map these functions? What if we don't have an > actual discard, but just a slow erase (I'm now assuming that erase > will likely be slow on sdcards)? Can't we just tell the user space > there is no discard? Like on a normal HDD? I have considered that, but not sure what would be the best option. > I really don't know the > implications, seems like mmc_erase() is just there for the linux > discard feature. mmc_erase() is used for both REQ_OP_DISCARD and REQ_OP_SECURE_ERASE, but that's an implementation detail that we can change, of course. Honestly, the hole erase/discard support in the mmc core deserves a cleanup and I am looking at that (occasionally). > > Coming from the user space side. Does mkfs.ext4 assumes its pre-discard > is fast? I'd think so, right? I'd presume it was intented to tell the > FTL of the block device, "hey these blocks are unused, you can do some > wear leveling with them". I would assume that too. On the other hand, I guess there are situations when user space could live with slow formatting times. In particular if the goal is to let card clean up its internal garbage, as a way to improve "performance" for later I/O writes. > > > Long time time ago, both the SD and eMMC spec introduced support for > > real discards commands, as being hints to the card without any > > guarantees of what will happen to the data from a logical or a > > physical point of view. If the card supports that, we should use it as > > the first option for REQ_OP_DISCARD. Although, what should we pick as > > the second best option, when the card doesn't support discard - that's > > when it becomes more tricky. And the similar applies for > > REQ_OP_SECURE_ERASE, or course. > > > > If you have any suggestions for how we can improve in the above > > decisions, feel free to suggest something. > > > > Another issue that most likely is causing poor performance for > > REQ_OP_DISCARD/REQ_OP_SECURE_ERASE for eMMC/SD, is that in > > mmc_queue_setup_discard() we set up the maximum discard sectors > > allowed per request and the discard granularity. > > > > To find performance bottlenecks, I would start looking at what actual > > eMMC/SD commands/args we end up mapping towards the > > REQ_OP_DISCARD/REQ_OP_SECURE_ERASE requests. Then definitely, I would > > also look at the values we end up picking as max discard sectors and > > the discard granularity. > > I'm just about finding some SD cards and looking how they behave timing > wise and what they report they support (ie. erase or discard). Looks > like other cards are doing better. But I'd have to find out if they > support the discard (mine doesn't) and if they are slow too if I force > them to use the normal erase. Sounds great, looking forward to hear more about your findings. [...] Kind regards Uffe