On Mon, Jan 06, 2025 at 05:52:48PM +0530, Sibi Sankar wrote:
> On 12/5/24 21:16, Johan Hovold wrote:

> > As Marc said, it seems you need to come up with a way to detect and work
> > around the broken firmware.
>
> The perf protocol version won't have any changes so detecting
> it isn't possible :( But there could be other ways, see below.
>
> > We want to get the fast channel issue fixed, but when we merge that fix
> > it will trigger these crashes if we also merge cpufreq support for x1e.
> >
> > Can you expand on the PERF_LEVEL_GET issue? Is it possible to
> > implement some workaround for the buggy firmware? Like returning a dummy
> > value? How exactly are things working today? Can't that be used as a
> > basis for a quirk?
>
> The main problem is the X1E firmware supports fast channel level get
> but when queried it says it doesn't support it :|. The PERF_LEVEL_GET
> regular messaging which gets used as a fallback has a bug which causes
> the device to crash. So we either enable cpufreq only on platforms
> that have the fix in place or live with the warning that certain messages
> don't support fast channel which I don't think will fly. I've also been
> told the crash wouldn't show up if we have all sleep states disabled.

We certainly also want cpufreq enabled on the current/older firmware
which has these bugs.

Based on the above, it sounds like your fix:

	https://lore.kernel.org/lkml/20241030125512.2884761-2-quic_sibis@xxxxxxxxxxx/

is correct even if it triggers the crash on machines with buggy
firmware.

Why can't you add a quirk for x1e platforms that makes sure that the
driver always uses fastchannel level get? You know it is supported (and
has to be used) even if the buggy firmware says it's not.

Just set the corresponding attribute bit unconditionally based on the DT
machine compatible (or fall back to the current implementation which
theoretically other broken fw implementations may also be relying on),
or similar.

Johan
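
To make the suggestion concrete, something along these lines is what I
have in mind. This is only a rough userspace sketch of the logic, not
the actual driver code: the helper names, the quirk table, the
"qcom,x1e80100" compatible string and the use of bit 0 as the
fastchannel attribute bit are all assumptions for illustration. In the
kernel the machine check would presumably be done with something like
of_machine_is_compatible() rather than a string passed in by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: bit 0 of the message attributes means "fastchannel supported". */
#define MSG_ATTR_FASTCHANNEL (UINT32_C(1) << 0)

/*
 * Hypothetical quirk table: machines whose firmware supports fastchannel
 * level get but reports that it does not. The compatible string is an
 * assumption for illustration.
 */
static const char *const fc_quirk_machines[] = {
	"qcom,x1e80100",
};

/* Return true if the machine compatible is listed in the quirk table. */
static bool machine_needs_fc_quirk(const char *compat)
{
	size_t i;
	size_t n = sizeof(fc_quirk_machines) / sizeof(fc_quirk_machines[0]);

	for (i = 0; i < n; i++) {
		if (strcmp(compat, fc_quirk_machines[i]) == 0)
			return true;
	}
	return false;
}

/*
 * Fix up the attributes reported by firmware for LEVEL_GET: on quirked
 * machines the fastchannel bit is set unconditionally, everywhere else
 * the firmware's answer is kept as is (the current behaviour).
 */
static uint32_t fixup_level_get_attrs(const char *compat, uint32_t fw_attrs)
{
	if (machine_needs_fc_quirk(compat))
		return fw_attrs | MSG_ATTR_FASTCHANNEL;
	return fw_attrs;
}

/* Decide whether LEVEL_GET should go through the fastchannel. */
static bool level_get_uses_fastchannel(const char *compat, uint32_t fw_attrs)
{
	return (fixup_level_get_attrs(compat, fw_attrs) & MSG_ATTR_FASTCHANNEL) != 0;
}
```

With this, a quirked machine whose firmware reports attributes of 0
still ends up using the fastchannel, so the crashing PERF_LEVEL_GET
fallback is never sent, while any machine not in the table keeps
following whatever the firmware reports.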