On 25/11/2023 08:50, Oleksij Rempel wrote: > On Sat, Nov 25, 2023 at 06:51:55AM +0000, Greg Kroah-Hartman wrote: >> On Fri, Nov 24, 2023 at 07:57:25PM +0100, Oleksij Rempel wrote: >>> On Fri, Nov 24, 2023 at 05:26:30PM +0000, Greg Kroah-Hartman wrote: >>>> On Fri, Nov 24, 2023 at 05:32:34PM +0100, Oleksij Rempel wrote: >>>>> On Fri, Nov 24, 2023 at 03:56:19PM +0000, Greg Kroah-Hartman wrote: >>>>>> On Fri, Nov 24, 2023 at 03:49:46PM +0000, Mark Brown wrote: >>>>>>> On Fri, Nov 24, 2023 at 03:27:48PM +0000, Greg Kroah-Hartman wrote: >>>>>>>> On Fri, Nov 24, 2023 at 03:21:40PM +0000, Mark Brown wrote: >>>>>>> >>>>>>>>> This came out of some discussions about trying to handle emergency power >>>>>>>>> failure notifications. >>>>>>> >>>>>>>> I'm sorry, but I don't know what that means. Are you saying that the >>>>>>>> kernel is now going to try to provide a hard guarantee that some devices >>>>>>>> are going to be shut down in X number of seconds when asked? If so, why >>>>>>>> not do this in userspace? >>>>>>> >>>>>>> No, it was initially (or when I initially saw it anyway) handling of >>>>>>> notifications from regulators that they're in trouble and we have some >>>>>>> small amount of time to do anything we might want to do about it before >>>>>>> we expire. >>>>>> >>>>>> So we are going to guarantee a "time" in which we are going to do >>>>>> something? Again, if that's required, why not do it in userspace using >>>>>> a RT kernel? >>>>> >>>>> For the HW in question I have only 100ms time before power loss. By >>>>> doing it over use space some we will have even less time to react. >>>> >>>> Why can't userspace react that fast? Why will the kernel be somehow >>>> faster? Speed should be the same, just get the "power is cut" signal >>>> and have userspace flush and unmount the disk before power is gone. Why >>>> can the kernel do this any differently? >>>> >>>>> In fact, this is not a new requirement. It exist on different flavors of >>>>> automotive Linux for about 10 years. Linux in cars should be able to >>>>> handle voltage drops for example on ignition and so on. The only new thing is >>>>> the attempt to mainline it. >>>> >>>> But your patch is not guaranteeing anything, it's just doing a "I want >>>> this done before the other devices are handled", that's it. There is no >>>> chance that 100ms is going to be a requirement, or that some other >>>> device type is not going to come along and demand to be ahead of your >>>> device in the list. >>>> >>>> So you are going to have a constant fight among device types over the >>>> years, and people complaining that the kernel is now somehow going to >>>> guarantee that a device is shutdown in a set amount of time, which >>>> again, the kernel can not guarantee here. >>>> >>>> This might work as a one-off for a specific hardware platform, which is >>>> odd, but not anything you really should be adding for anyone else to use >>>> here as your reasoning for it does not reflect what the code does. >>> >>> I see. Good point. >>> >>> In my case umount is not needed, there is not enough time to write down >>> the data. We should send a shutdown command to the eMMC ASAP. >> >> If you don't care about the data, why is a shutdown command to the >> hardware needed? What does that do that makes anything "safe" if your >> data is lost. > > It prevents HW damage. In a typical automotive under-voltage labor it is > usually possible to reproduce X amount of bricked eMMCs or NANDs on Y > amount of under-voltage cycles (I do not have exact numbers right now). > Even if the numbers not so high in the labor tests (sometimes something > like one bricked device in a month of tests), the field returns are > significant enough to care about software solution for this problem. > > Same problem was seen not only in automotive devices, but also in > industrial or agricultural. With other words, it is important enough to bring > some kind of solution mainline. > IMO that is a serious problem with the used storage / eMMC in that case and it is not suitable for industrial/automotive uses? Any industrial/automotive-suitable storage device should detect under-voltage and just treat it as a power-down/loss, and while that isn't nice for the storage device, it really shouldn't be able to brick a device (within <1M cycles anyway). What does the storage module vendor say about this? BR, Christian