On 31.10.2014 at 16:34, Richard Weinberger wrote:
> Hi Tanya,
>
> On 31.10.2014 at 14:12, Tanya Brokhman wrote:
>> Hi Richard
>>
>> On 10/29/2014 2:00 PM, Richard Weinberger wrote:
>>> Tanya,
>>>
>>> On 29.10.2014 at 12:03, Tanya Brokhman wrote:
>>>> I'll try to address all your comments in one place.
>>>> You're right that the read counters don't have to be exact, but they do have to reflect the real state.
>>>
>>> But it does not really matter if the counters are way too high or too low?
>>> It also does not matter if a re-read of adjacent PEBs is issued too often.
>>> It won't hurt.
>>>
>>>> Regarding your idea of saving them to a file, or somehow with userspace involved: this is doable, but such a solution will depend on the userspace implementation:
>>>> - one needs to update the kernel with the correct read counters (saved somewhere in userspace)
>>>> - it is required on every boot.
>>>> - saving the counters back to userspace should be periodically triggered as well.
>>>> So the minimal workflow for each boot life cycle will be:
>>>> - on boot: update kernel with correct values from userspace
>>>
>>> Correct.
>>>
>>>> - kernel updates the counters on each read operation
>>>
>>> Yeah, that's a plain simple in-kernel counter...
>>>
>>>> - on powerdown: save the updated kernel counters back to userspace
>>>
>>> Correct. The counters can also be saved once a day by cron.
>>> If one or two save operations are missed it won't hurt either.
>>>
>>>> The read-disturb handling is based on the kernel updating and monitoring read counters. Taking this out of kernel space will result in an incomplete and very fragile solution for
>>>> the read-disturb problem, since the dependency on userspace is just too big.
>>>
>>> Why?
>>> We both agree on the fact that the counters don't have to be exact.
>>> Maybe I'm wrong, but to my understanding they are just a rough indicator that sometime later UBI has to check for bitrot/flips.
>>
>> The idea is to prevent data loss, to prevent errors while reading, because we might hit errors we can't fix. So although the read_disturb_threshold is a rough estimation based on
>> statistics, we can't ignore it and need to stay close to the calculated statistics.
>>
>> It's really the same as wear-leveling. You have a limitation that each PEB can be erased a limited number of times. This erase limit is also an estimation based on statistics
>> collected by the card vendor. But you do want to know the exact value of the erase counter to prevent erasing the block excessively.
>
> So you have to update the EC-Header every time we read a PEB...?
>
>>
>>>
>>>> Another issue to consider is that each SW upgrade will result in losing the counters saved in userspace and resetting them all. Otherwise, the system upgrade process will also have to be
>>>> updated.
>>>
>>> Does it hurt if these counters are lost upon an upgrade?
>>> Why do we need them forever?
>>> If they start from 0 again after an upgrade, heavily read PEBs will quickly gain a high counter and will be checked.
>>
>> Yes, we do need the ACCURATE counters and can't lose them. For example: we have a heavily read block. It was read 100 times when the read threshold is 101. Meaning, the 101st
>> read will most probably fail.
>
> Are you trying to tell me that the NAND is so crappy that it will die after 100 reads? I really hope this was just a bad example.
> You *will* lose counters unless you update the EC-Header upon every read, which is also not sane at all.
>
>> You do a SW upgrade, set the read counter for this block to 0 and don't scrub it. The next time you try reading from it (since it's a heavily read block), you'll get errors. If
>> you're lucky, ECC will fix them for you, but it's not guaranteed.
>>
>>>
>>> And of course these counters can be preserved. One can also place them into a UBI static volume.
>>> Or use a sane upgrade process...
>>
>> "Sane upgrade" means that in order to support read-disturb we twist the user's arm into implementing non-trivial logic in userspace.
>>
>>>
>>> As I wrote in my last mail, we could also create a new internal UBI volume to store these counters.
>>> Then you can have the logic in the kernel but don't have to change the UBI on-disk layout.
>>>
>>>> The read counters are very much like the EC counters used for wear-leveling: one is updated on each erase, the other on each read; one is used to handle issues caused by frequent
>>>> writes (erase operations), the other handles issues caused by frequent reads.
>>>> So how are the two different? Why isn't wear-leveling (and erase counters) handled by userspace? My guess is that the decision to encapsulate wear-leveling in the kernel was due
>>>> to the above-mentioned reasons.
>>>
>>> The erase counters are crucial for UBI to operate. Even while booting up the kernel and mounting UBIFS the EC counters have to be available,
>>> because UBI may need to move LEBs around or has to find free PEBs which are not worn out. If UBI makes a bad decision here, things will break.
>>
>> Same with read counters and last_erase_timestamps. If EC counters are lost, we might end up with bad blocks (since they are worn out) and have data loss.
>> If we ignore read-disturb and don't scrub heavily read blocks we will have data loss as well.
>> The only difference between the two scenarios is "how long before it happens". Read-disturb wasn't an issue since the average lifespan of a NAND device was ~5 years. Read-disturb occurs
>> over a longer lifespan. That's why it's required now: a need for a "long-life NAND".
>
> Okay, read-disturb will only happen if you read blocks *very* often. Do you have numbers, datasheets, etc...?
>
> Let's recap.
>
> We need to address two issues:
> a) If a PEB is read very often we need to scrub it.
> b) PEBs which are not read for a very long time need to be re-read/scrubbed to detect bit-rot.
>
> Solving b) is easy, just re-read every PEB from time to time. No persistent data at all is needed.
> To solve a) you suggest adding the read-counter to the UBI on-disk layout like the erase-counter values.
> I don't think that this is a good solution.
> We can perfectly well save the read-counters from time to time and upon detach, either to a file on UBIFS
> or into a new internal volume. As read-disturb will only happen after a long time and hence very high read-counters,
> it does not matter if we lose some values upon a power cut, e.g. such that a counter is 50000 instead of 50500.
> Btw: We also have to be very careful that reading data will not wear out the flash.
>
> So, we need logic within UBI which counts every read access and persists this data in some way.
> As suggested in an earlier mail this can also be done purely in userspace.
> It can also be done within the UBI kernel module, e.g. by storing the counters in an internal volume.

Another point:
What if we scrub every PEB once a week?
Why would that not work?

Thanks,
//richard
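
To make the "save from time to time" idea concrete, here is a rough userspace model in Python. It is only a sketch under assumptions, not UBI code: the class name `ReadCounters`, the JSON backing file, the threshold and the save interval are all hypothetical and chosen for illustration. It counts reads per PEB in memory, persists the counters every `save_every` reads, and reloads them on the next "boot", so a power cut loses at most the reads since the last save.

```python
import json
import os
import tempfile


class ReadCounters:
    """Toy model of per-PEB read counters with periodic persistence."""

    def __init__(self, path, threshold, save_every):
        self.path = path              # hypothetical backing file
        self.threshold = threshold    # hypothetical read-disturb threshold
        self.save_every = save_every  # persist after this many reads
        self.reads_since_save = 0
        self.counters = self._load()  # "on boot": restore saved values

    def _load(self):
        try:
            with open(self.path) as f:
                # JSON object keys are strings, so convert back to int
                return {int(k): v for k, v in json.load(f).items()}
        except FileNotFoundError:
            return {}

    def save(self):
        # Write to a temporary file and rename it over the old one, so a
        # power cut leaves either the old or the new file in place.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.counters, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)
        self.reads_since_save = 0

    def record_read(self, peb):
        """Count one read; return True if the PEB should be scrubbed."""
        self.counters[peb] = self.counters.get(peb, 0) + 1
        self.reads_since_save += 1
        if self.reads_since_save >= self.save_every:
            self.save()
        return self.counters[peb] >= self.threshold

    def scrubbed(self, peb):
        """Reset the counter once the PEB has been scrubbed."""
        self.counters[peb] = 0


def demo():
    path = os.path.join(tempfile.mkdtemp(), "read_counters.json")
    rc = ReadCounters(path, threshold=100_000, save_every=1000)
    for _ in range(50_500):
        rc.record_read(7)
    # Simulate a power cut: the in-memory state is gone, only the file remains.
    after = ReadCounters(path, threshold=100_000, save_every=1000)
    return rc.counters[7], after.counters[7]


if __name__ == "__main__":
    print(demo())  # (50500, 50000)
```

With these made-up numbers the reloaded counter is 50000 instead of 50500, which is the kind of loss discussed above. Whether such a loss is acceptable depends on how close the real threshold is to the counter values, and that is exactly the open question in this thread.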