On 6/03/23 18:09, Ulf Hansson wrote: > On Fri, 3 Mar 2023 at 15:41, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote: >> >> On 3/03/23 14:01, Ulf Hansson wrote: >>> On Fri, 3 Mar 2023 at 12:40, Christian Löhle <CLoehle@xxxxxxxxxxxxxx> wrote: >>>> >>>> >>>>>> >>>>>> REQ_FUA is in general supported for eMMC cards, which translates into so called "reliable writes". To support these write operations, the CMD23 (MMC_CAP_CMD23), needs to be supported by the mmc host too, which is common but not always the case. >>>>>> >>>>>> For some eMMC devices, it has been reported that reliable writes are quite costly, leading to performance degradations. >>>>>> >>>>>> In a way to improve the situation, let's avoid announcing REQ_FUA support if the eMMC supports an internal cache, as that allows us to rely solely on flush-requests (REQ_OP_FLUSH) instead, which seems to be a lot cheaper. >>>>>> Note that, those mmc hosts that lacks CMD23 support are already using this type of configuration, whatever that could mean. >>>>> >>>>> Just note that reliable write is strictly weaker than turning cache off/flushing, if card loses power during cache off/flush programming / busy, sector-wise atomicity is not mandated by the spec. >>>>> (And that is assuming cache off/flush is actually respected by the card as intended by the spec, should some cards be checked?) Maybe some FS people can also chime in? >>>> >>>> Nevermind, the sector-wise atomicity should not matter on 5.1 cards or if the block length isn't being played with, which it isn't in our case. >>>> If reliable write is implemented only according to spec, I don't see why the cache flushing should be less expensive, which would only make sense if >>>> a) < sector chunks are committed to flash >>>> b) reliable write is implemented much stricter than the spec, ensuring atomicity for the entire write. >>> >>> Right, I agree! >>> >>> Note 1) Reliable write was introduced way before cache management in >>> the eMMC spec. So, if the support for reliable write would have a >>> stricter implementation than needed, I would not be surprised. >> >> I am not sure when you say stricter than needed. Historically >> file systems assumed that sectors are updated atomically i.e. >> there is never a sector with a mixture of old and new data. >> The eMMC spec does not guarantee that, except for reliable >> write. > > Yes, I agree. With stricker, I was merely thinking of whether the eMMC > makes the entire write request (multiple sectors) being atomic, or the > guarantee is only per sector basis. > > According to the eMMC spec, that seems to be implementation specific. > One option could be heavier than the other, I guess. > >> >> File systems may use REQ_FUA for important information, like the >> superblock or a journal commit record, so using reliable write >> for REQ_FUA would seem to give better protection against file system >> corruption than a cache flush which could leave a sector >> half-written. > > Yes, I agree. If we should fully conform to what is stated in the eMMC > spec, we should probably keep the current path to support REQ_FUA. > >> >> On the other hand, sudden power loss is probably rare in battery >> powered systems because they are designed to monitor the battery >> power and shutdown when it gets too low. >> >> And file systems can use checksums to detect half-written updates. >> >> And there is anyway no protection for other (non REQ_FUA) writes a >> file system might do and expect not to tear sectors. >> >> And you are more likely to smash the screen than bounce the battery >> out and cause an unrecoverable file system error. > > Right, these are good arguments to why $subject patch perhaps makes > sense to move forward with anyway. Yes > > Moreover, it seems like some eMMC vendors don't really have any > concerns with us moving away from reliable writes, to instead use only > "cache flushing". I guess it can indicate that the regular writes may > already be treated in an atomic kind of way, but who knows. Indeed > >> >> Nevertheless, the commit message of this patch reads like the change >> is an optimization, whereas it seems more like a policy change. >> The commit message should perhaps say something like: >> "The consensus is that the benefit of improved performance by not >> using reliable-write for REQ_FUA is much greater than any potential >> benefit that reliable-write might provide to avoid file system >> corruption in the event of sudden power loss." > > I agree. I will improve it along the lines of what you suggest. > >> >> As for allowing for the policy to be overridden, perhaps an mmc_core >> module option? > > Even if I am not very fond of module parameters, this seems like a > reasonable thing to use for this case. > > I was also looking at using a card quirk. Yes that makes sense > >> >>> >>> Note 2) In the eMMC v5.1 spec, the cache flushing support has been >>> extended to allow an explicit barrier operation. Perhaps, we should >>> let that option take precedence over a regular flush+barrier, for >>> REQ_OP_FLUSH!? >>> >>>> >>>> I guess the cards which increase performance do b)? Or something else? >>> >>> That's the tricky part to know, as it's the internals of the eMMC. >> >> It is the natural conclusion though. The eMMC probably does not >> update mapping information with every write, instead if power is >> lost, it scans the updated areas at the next initialization. (The >> power-off notify feature would commit the mapping information to >> media to avoid that). So a reliable write might have to: >> 1. write information to record that the old mapping >> should be used, not what might be discovered by scanning >> 2. do the actual write >> 3. write mapping information to record the new mapping > > Yes. And depending on the eMMC device, some may be more clever than > others for how to actually deal with this. > >> >>> >>> Although, it seems like both Avri (WDC) and Bean (Micron) would be >>> happy to proceed with $subject patch, which makes me more comfortable >>> to move forward. >>> >>>> Anyway regarding FUA i don't have any concerns regarding reliability with cache flush. >>>> I can add some performance comparisons with some eMMCs I have around though. >>> >>> That would be great, thanks a lot for helping out with testing! >>> > > Kind regards > Uffe