On 2/20/25 9:38 PM, Martin K. Petersen wrote:
John,
However, I agree it would be better to just fix the driver,
performance impact notwithstanding, and ship it. For my part I'd
rather have a correctly functioning driver, that's slower, but doesn't
panic.
I prefer to have a driver that doesn't panic when the user performs a
reasonably normal administrative action.
Agreed. The only clarification I want to make is that users will
not see a panic; they will see IO timeouts and host bus resets.
It was my mistake to report earlier that the host would panic.
When aac_cpu_offline_feature is disabled, users will see higher performance,
but if they start off-lining CPUs they may see IO timeouts. This is the
state of the current driver, and it is the problem that the original patch:
commit 9dc704dcc09e ("scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity")
was supposed to have fixed. However, the original patch didn't fix the
problem correctly, and it resulted in the regression reported in Bugzilla 217599[1].
This patch circles back and fixes the original problem correctly. The extra
'aac_cpu_offline_feature' modparam was added to disable the new code path
because of concerns raised during our testing at Red Hat about reduced
performance with this patch.
If go-faster stripes are desired in specific configurations, then make
the performance mode an opt-in. Based on your benchmarks, however, I'm
not entirely convinced it's worth it...
I agree. So how about we just take out the aac_cpu_offline_feature modparam...?
Alternatively, we can replace the modparam with a Kconfig option. The default setting for
the new Kconfig option would be offline_cpu_support_on and performance_mode_off. That way
we can ship a default kernel configuration that provides a working aacraid driver which
safely supports off-lining CPUs. If people are really unhappy with the performance, and they
don't care about offline CPU support, they can reconfigure their kernel.
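
For illustration only, the Kconfig alternative might look roughly like the
fragment below. The symbol name SCSI_AACRAID_CPU_OFFLINE, the prompt, and the
help text are placeholders I made up for this sketch; only the SCSI_AACRAID
dependency is an existing symbol.

```
# Hypothetical option, sketched for discussion; not taken from a posted patch.
config SCSI_AACRAID_CPU_OFFLINE
	bool "aacraid: support off-lining CPUs"
	depends on SCSI_AACRAID
	default y
	help
	  When enabled (the default), the driver uses the code path that
	  safely supports off-lining CPUs, at some cost in performance.

	  Saying N selects the higher-performance path, but IO timeouts
	  may be seen if CPUs are taken offline.
```

With "default y", a stock configuration gets the safe behavior, and anyone who
wants the faster path has to turn the option off explicitly.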
Personally I prefer option 1 (removing the modparam), but we'd like to hear the thoughts of the upstream users.
I've added the original authors of Bugzilla 217599[1] to the cc list to
get their attention and review.
/John
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217599