> (+CC Avri and Adrian) Personally, I don't have a strong opinion of any of this. I expect the hw to handle ungraceful voltage drops via its internal mechanisms. Up to some point, that is. Thanks, Avri > > On 2/21/25 09:39, Oleksij Rempel wrote: > > Introduce `_mmc_handle_undervoltage()` to handle undervoltage events > > for MMC/eMMC devices. This function interrupts ongoing operations > > using High Priority Interrupt (HPI) and performs a controlled suspend. > > After completing the sequence, the card is marked as removed to > > prevent further interactions, ensuring that no further commands are > > issued after an emergency stop. > > > > Implementation Details: > > 1. **Interrupt ongoing operations**: > > - If the eMMC is executing a long-running operation (e.g., erase, trim, > > or write), > > attempt to stop it using HPI (`mmc_interrupt_hpi()`). > > - If HPI fails, an error is logged, but the sequence continues. > > > > 2. **Suspend the card in an emergency state**: > > - Call `__mmc_suspend()` with `is_undervoltage = true`, which ensures: > > - The power-off notification uses `EXT_CSD_POWER_OFF_SHORT`. > > - Cache flushing is skipped to minimize time delays. > > - If power-off notify is unsupported, alternative methods like sleep > > or deselect are used to transition the card into a safe state. > > > > 3. **Mark the card as removed**: > > - This prevents further commands from being issued to the card after > > undervoltage shutdown, avoiding potential corruption. > > > > To support this, introduce `__mmc_suspend()` and `__mmc_resume()` as > > internal helpers that omit `mmc_claim_host()/mmc_release_host()`, > > allowing them to be called when the host is already claimed. > > > > The caller of `_mmc_handle_undervoltage()` is responsible for invoking > > `mmc_claim_host()` before calling this function and > > `mmc_release_host()` afterward to ensure exclusive access to the host > > during the emergency shutdown process. > > > > Device Handling Considerations: > > - **For eMMC storage**: The new undervoltage handler applies the correct > > power-down sequence using power-off notify or alternative methods. > > - **For SD cards**: The current implementation does not handle > undervoltage > > events for SD cards. Future extensions may be needed to implement proper > > handling. > > > > Testing: > > This implementation was tested on an iMX8MP-based system, verifying > > that the undervoltage sequence correctly stops ongoing operations and > > prevents further MMC transactions after the event. The board had > > approximately 100ms of available power hold-up time. The Power Off > > Notification was sent ~4ms after the board was detached from the > > power supply, allowing sufficient time for the eMMC to handle the > > event properly. > > > > The testing was performed using a logic analyzer to monitor command > > sequences and timing. While this method confirms that the expected > > sequence was executed, it does not provide insights into the actual > > internal behavior of the eMMC storage. > > > > Signed-off-by: Oleksij Rempel <o.rempel@xxxxxxxxxxxxxx> > > --- > > changes v3: > > - reword commit message. > > - add comments in the code > > - do not try to resume sleeping device > > --- > > drivers/mmc/core/mmc.c | 115 > > ++++++++++++++++++++++++++++++++++++----- > > 1 file changed, 102 insertions(+), 13 deletions(-) > > > > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c index > > 9270bde445ad..a50cdd550a22 100644 > > --- a/drivers/mmc/core/mmc.c > > +++ b/drivers/mmc/core/mmc.c > > @@ -2104,8 +2104,8 @@ static int _mmc_flush_cache(struct mmc_host > *host) > > return err; > > } > > > > -static int _mmc_suspend(struct mmc_host *host, bool is_suspend, > > - bool is_undervoltage) > > +static int __mmc_suspend(struct mmc_host *host, bool is_suspend, > > + bool is_undervoltage) > > { > > unsigned int notify_type; > > int err = 0; > > @@ -2116,8 +2116,6 @@ static int _mmc_suspend(struct mmc_host > *host, bool is_suspend, > > else > > notify_type = EXT_CSD_POWER_OFF_LONG; > > > > - mmc_claim_host(host); > > - > > if (mmc_card_suspended(host->card)) > > goto out; > > > > @@ -2145,7 +2143,18 @@ static int _mmc_suspend(struct mmc_host > *host, bool is_suspend, > > mmc_card_set_suspended(host->card); > > } > > out: > > + return err; > > +} > > + > > +static int _mmc_suspend(struct mmc_host *host, bool is_suspend, > > + bool is_undervoltage) > > +{ > > + int err; > > + > > + mmc_claim_host(host); > > + err = __mmc_suspend(host, is_suspend, is_undervoltage); > > mmc_release_host(host); > > + > > return err; > > } > > > > @@ -2165,6 +2174,20 @@ static int mmc_suspend(struct mmc_host > *host) > > return err; > > } > > > > +static int __mmc_resume(struct mmc_host *host) { > > + int err; > > + > > + if (!mmc_card_suspended(host->card)) > > + return 0; > > + > > + mmc_power_up(host, host->card->ocr); > > + err = mmc_init_card(host, host->card->ocr, host->card); > > + mmc_card_clr_suspended(host->card); > > + > > + return err; > > +} > > + > > /* > > * This function tries to determine if the same card is still present > > * and, if so, restore all state to it. > > @@ -2174,16 +2197,9 @@ static int _mmc_resume(struct mmc_host > *host) > > int err = 0; > > > > mmc_claim_host(host); > > - > > - if (!mmc_card_suspended(host->card)) > > - goto out; > > - > > - mmc_power_up(host, host->card->ocr); > > - err = mmc_init_card(host, host->card->ocr, host->card); > > - mmc_card_clr_suspended(host->card); > > - > > -out: > > + err = __mmc_resume(host); > > mmc_release_host(host); > > + > > return err; > > } > > > > @@ -2194,6 +2210,13 @@ static int mmc_shutdown(struct mmc_host > *host) > > { > > int err = 0; > > > > + /* > > + * In case of undervoltage, the card will be powered off by > > + * _mmc_handle_undervoltage() > > + */ > > + if (host->undervoltage) > > + return 0; > > + > > /* > > * In a specific case for poweroff notify, we need to resume the card > > * before we can shutdown it properly. > > @@ -2285,6 +2308,71 @@ static int _mmc_hw_reset(struct mmc_host > *host) > > return mmc_init_card(host, card->ocr, card); } > > > > +/** > > + * _mmc_handle_undervoltage - Handle an undervoltage event for > > +MMC/eMMC devices > > + * @host: MMC host structure > > + * > > + * This function is triggered when an undervoltage condition is detected. > > + * It attempts to safely stop ongoing operations and transition the > > +device > > + * into a low-power or safe state to prevent data corruption. > > + * > > + * Steps performed: > > + * 1. If no card is present, return immediately. > > + * 2. Attempt to interrupt any ongoing operations using High Priority > Interrupt > > + * (HPI). > > + * 3. Perform an emergency suspend using EXT_CSD_POWER_OFF_SHORT if > possible. > > + * - If power-off notify is not supported, fallback mechanisms like sleep or > > + * deselecting the card are attempted. > > + * - Cache flushing is skipped to reduce execution time. > > + * 4. Mark the card as removed to prevent further interactions after > > + * undervoltage. > > The good path now looks like the best bet to me. > The PON fallbacks are still questionable IMO. > Deselect, if taking the spec seriously, cannot really give a hint to FTL. > While daisy-chaining MMC and SD is rather rare these days, that's still the > behavior vendors implement AFAIK. > Sleep with it's vendor-defined timeouts can be anything, so this could even > trigger a cache flush internally, after we avoided sending one because of > undervoltage. > If we were to rely on actually getting 100ms heads-up we could check that > against the timeout reported by the card, but IME those are wild guesses > rather than measured values. > > > + * > > + * Note: This function does not handle host claiming or releasing. The caller > > + * must ensure that the host is properly claimed before calling this > > + * function and released afterward. > > + * > > + * Returns: 0 on success, or a negative error code if any step fails. > > + */ > > +static int _mmc_handle_undervoltage(struct mmc_host *host) { > > + struct mmc_card *card = host->card; > > + int err = 0; > > + > > + /* If there is no card attached, nothing to do */ > > + if (!card) > > + return 0; > > + > > + /* > > + * Try to interrupt a long-running operation (such as an erase, trim, > > + * or write) using High Priority Interrupt (HPI). This helps ensure > > + * the card is in a safe state before power loss. > > I don't know about safe state before power loss, it's getting it in a state where > we can execute the below. > > > + */ > > + err = mmc_interrupt_hpi(card); > > + if (err) > > + pr_err("%s: Interrupt HPI failed, error %d\n", > > + mmc_hostname(host), err); > > + > > + /* > > + * Perform an emergency suspend to power off the eMMC quickly. > > + * This ensures the device enters a safe state before power is lost. > > + * We first attempt EXT_CSD_POWER_OFF_SHORT, but if power-off > notify > > + * is not supported, we fall back to sleep mode or deselecting the card. > > + * Cache flushing is skipped to minimize delay. > > + */ > > + err = __mmc_suspend(host, false, true); > > + if (err) > > + pr_err("%s: error %d doing suspend\n", > mmc_hostname(host), err); > > + > > + /* > > + * Mark the card as removed to prevent further operations. > > + * This ensures the system does not attempt to access the device > > + * after an undervoltage event, avoiding potential corruption. > > + */ > > + mmc_card_set_removed(card); > > + > > + return err; > > +} > > + > > static const struct mmc_bus_ops mmc_ops = { > > .remove = mmc_remove, > > .detect = mmc_detect, > > @@ -2297,6 +2385,7 @@ static const struct mmc_bus_ops mmc_ops = { > > .hw_reset = _mmc_hw_reset, > > .cache_enabled = _mmc_cache_enabled, > > .flush_cache = _mmc_flush_cache, > > + .handle_undervoltage = _mmc_handle_undervoltage, > > }; > > > > /* > > No need to resend anything at this point, I'm curious what others have to say > on the series.