On 3/18/24 23:21, Hans de Goede wrote:
>>> But can we please drop the problematic quirk to lower the number
>>> of ports for now, to avoid more people getting bitten by this
>>> regression ?
>>
>> I am strongly against reverting that fix/improvement because of a problem with a
>> badly broken hardware that does not respect the AHCI specifications. Such
>> regression was bound to happen with such hardware and likely will happen again
>> in the future if we touch anything that does not fit with the adapters weird use
>> of PMP or other feature. I do not want libata code to be stuck as it is for fear
>> of breaking support for adapters that are already broken in the first place...
>>
>> So let's go the other way around and add a libata.force parameter that allows
>> disabling the port count fix, or allows specifying a port mask. That will allow
>> users of these broken adapters to get them running again. Ideally, we would use
>> a quirk but it seems that the same controller chip is used in both correct
>> setups and broken-PMP setups. So unless ASMedia indicates some black-magic
>> register we can look at to know what to do, it will have to be a "manual" module
>> parameter.
>
> The kernel has a clear no regressions policy and there is ample documented
> cases where needing to set a module option to undo the regression was
> considered not acceptable.
>
> So there really is no discussion here. We must not regress and thus the default
> behavior must be behavior which works out of the box on the boards with
> the PMP chips on them.

I am well aware of this policy and always work hard to not introduce any
regressions, or to address them with the highest priority when they happen. It
is however very unfortunate that such a policy must be followed even when the
regression is due to some bad hardware that does not correctly follow
specifications and just happened to "work" by chance before a change.
I do not feel that is right, especially considering that in this case, the
revert will cause users with correct hardware to again see very long boot times
(on the order of minutes).

> Also we really want Linux to "just work" having to set a module option
> just to make things work very much goes against that.

Sure, but then avoiding the long boot time will still need a module parameter to
force applying a port mask. So it is one or the other here... I vote for
supporting good hardware well.

Anyway, I am not going to fight this and will go with the revert for now. But if
we do not get anything from ASMedia to help resolve this, these broken adapters
are likely to cause issues again in the future. And because of that, I would
very much like to just blacklist them unless someone writes a special driver for
these that does not pretend to be AHCI compliant. Because they are not.

-- 
Damien Le Moal
Western Digital Research